Windows DNA





Windows 2000 is Web-to-the-core. 

You've got major-league stuff here. The new Microsoft Windows 
2000 operating system is at the foundation of Windows DNA. 
This new OS now has a complete application server to develop 
and run serious enterprise-level Web apps. Its built-in middleware 
includes component services (COM+), Transaction Services (MTS), 
Message Queue Services (MSMQ), Internet Information Services 
(IIS), and an integrated XML parser. All members of the Windows 
DNA team work well together, including SQL Server™ 7.0, SNA Server 
4.0, Site Server Commerce Edition 3.0, and Visual Studio. 





Develop the apps you need for the Business Internet right now. 

This is a five-alarm fire. They want Web apps. Everything from supply chain 
integration to sales tracking. And they want them yesterday. Good thing you're 
ready. With Windows® DNA, your experience with Microsoft® Windows and the 
Visual Studio® 6.0 development system has prepared you to develop those 
big, killer, Web apps faster than with any other platform. Using the skills you 
already have, you're well on your way to meeting the demands of the Business 
Internet right now. 


INTRODUCING WINDOWS DNA 


The Web development platform you already know. 








XML and the new era of Web development. 

The launch sequence has begun. Your mission: 1) To take 
applications where they’ve never been before. 2) To lead the 
Internet beyond just HTML browsing, to programming the Web 
via XML. 3) To integrate business processes using XML, through 
BizTalk, which lowers costs and speeds development. Houston, 
we have a program: the Windows DNA platform with XML. 





MSDN provides comprehensive Windows 
DNA guidance. 

MSDN is the number one resource for developers. 
It provides intensive care with specs, testing, 
events, procedures, and even second opinions. 
For help with Windows DNA, the easy and fast
way to build serious Web apps, visit
msdn.microsoft.com/windowsdna
Windows DNA is the leading platform for building the Business Internet. When these companies
needed to build serious, reliable Web applications, Windows® DNA delivered big time. In fact, Microsoft®
Windows DNA was chosen as the Web platform by over half of the top 50 shopping sites.* To learn more,
visit msdn.microsoft.com/windowsdna
Poseidonius used the position of the star Canopus to measure the
earth’s circumference around 10 B.C. When Columbus used that
calculation to chart his course to the Indies some 1,500 years later, he
ended up off-target, over-budget and late. Though adequate for shorter
journeys, it wasn’t the best tool for a complicated, long-range voyage.








Using a relational database management system with object-orientated (OO) programming can
severely undermine the benefits of OO. The costs of mixing object-to-relational programming are
high and often negate the benefits of OO programming, such as flexibility, reuse and simplicity.
Using an object database with OO programming can reduce your code by 30%.

To learn more about how Objectivity’s object database can help you exploit the object-related
benefits in your application, contact us for a free copy of our white paper, Accelerating Your
Object-Oriented Development. To get your copy, and discover a whole new world, visit our web
site or call (800) 767-6259.
GJ: A GENERIC JAVA
by Philip Wadler

Generic Java (GJ) adds generic types to the Java language. GJ is compatible with Java, the Java
Virtual Machine, and existing libraries. It is also efficient, in that information about generic types is
maintained only at compile time, not run time.


COLLABORATIVE APPLICATIONS AND THE JAVA SHARED DATA TOOLKIT
by Joshua Fox

The Java Shared Data Toolkit is designed to help you write distributed collaborative applications so
that groups of users can work simultaneously on a common task.


JAVA REFERENCES
by Jonathan Amsterdam

Java lets a program refer to objects without preventing those objects from being garbage collected.
Jonathan explains how references work and presents useful abstractions that make working with
them easier.


PYTHON SERVER PAGES: PART II 
by Kirby W. Angell 


Last month, Kirby introduced Python Server Pages and looked at how HTML pages with
embedded scripts are translated into compilable JPython code. This month, he examines the
Java Servlet side of PSP.


JAVA, XML, & LITERATE PROGRAMMING
by Andrew Dwelly

Marius, the system Andrew presents here, implements some of Donald Knuth’s ideas about literate
programs, but uses Java as its programming language, with HTML as the output. In the process,
Marius leverages the power of XML.


OPENCARD FRAMEWORK APPLICATION DEVELOPMENT
by Vesna Hassler and Oliver Fodor

The Personal Computer/Smart Card Interface (PC/SC) and OpenCard Framework emphasize the
interoperability of smartcards and card terminals, and the integration of those card terminals into
Microsoft Windows.
EMBEDDED SYSTEMS


THE REAL-TIME SPECIFICATION FOR JAVA
by David Hardin

The Real-Time Specification for Java promises to bring the benefits of Java to real-time
developers. David examines the requirements and design decisions that led to the Real-Time
Specification for Java, and provides practical examples of its use.


INTERNET PROGRAMMING 


WEBRELAY: A MULTITHREADED HTTP RELAY SERVER
by Peter Zhang

Webrelay is a freely available multithreaded HTTP relay server that verifies that clients
are legitimate users before connecting them to vendor web servers.
VISUALIZING NETWORK RESOURCES USING VISIO
by Chris Trueman

Visio is a generic diagram construction tool that just happens to include a
powerful visualization engine. Chris uses that engine to write a C++ tool that
generates diagrams to represent all the available resources on a Windows
network.


COLUMNS 


PROGRAMMING PARADIGMS
by Michael Swaine

You won't need to go to the state of “Denmark” to know that something
smells. And you can bet Michael nose what he’s talking about this month.


C PROGRAMMING
by Al Stevens

Al resurrects S, a homebrew C variant he implemented a decade ago, back
before the days of JavaScript and VB. His updated version is written in C++,
and includes a shell program that tests the interpreter by loading and executing
text source-code files written in the S language.


JAVA Q&A
by Ethan Henry and Ed Lycklama

Our authors show what you can do when Java exhibits classic memory leak
behavior: unbounded memory growth leading to poor performance and
eventually crashing.


ALGORITHM ALLEY
by Michael J. Wiener

Michael presents some key optimizations (with source-code examples) that can
be made to make the RSA algorithm as fast as possible.


DR. ECCO’S OMNIHEURIST CORNER
by Dennis E. Shasha

Landmines are a nasty piece of work, indeed. Ecco and Liane need to come up
with ways to make removing them a safer proposition.


PROGRAMMER’S BOOKSHELF
by Gregory V. Wilson

This month, Greg examines Mastering Algorithms with Perl, by Jon Orwant,
Jarkko Hietaniemi, and John Macdonald; Programming for the Java Virtual
Machine, by Joshua Engel; Java for Engineers and Scientists, by Stephen J.
Chapman; Introductory Java for Scientists and Engineers, by Richard Davies;
the C++ Toolkit for Scientists and Engineers, by James T. Smith; Quantum
Computing and Communications, by Michael Brooks; and Steven Roman's
Learning Word Programming.
As a service to our readers, source code and
related files, and author guidelines are available 
at http://www.ddj.com/. Source code is also 
available via anonymous FTP from ftp.ddj.com 
(199.125.85.76). Letters to the editor, article 
proposals/submissions, and inquiries can be sent 
to editors@ddj.com, faxed to 650-358-9749, or 
mailed to Dr. Dobb’s Journal, 411 Borel Ave., 
Suite 100, San Mateo, CA 94402-3522. 

For subscription questions, change of address, 
and orders, call 800-456-1215 (U.S. or Canada). 
For all other countries, call 303-678-8475 or fax 
P.O. Box 56188, Boulder, CO 80322-6188.
copy (includes shipping and handling). For issue
Real-world data management solutions
are typically more complex, once one
examines the pieces, than most database
programmers initially recognize. All
software projects are complex puzzles
composed of many details, most of which
are data-related. Today’s “DBMS” solutions
often sacrifice the speed or control
essential for a competitive application.

c-tree Plus®, by FairCom, has been the
choice of commercial developers for twenty
years precisely because it offers the
flexibility and control at the detail level to fit
a wide variety of data management needs.
Proven on large Unix servers and
workstations, c-tree Plus’s small footprint
and exceptional performance have also made
it the engine of choice for professional
developers on Windows and Mac. c-tree Plus
offers sophisticated ISAM level control with
which the developer may define precise data
management solutions, making it a perfect
fit for any development project requiring
specific data handling features.

c-tree Plus® offers the most
mature ISAM solution today.


FairCom’s c-tree Plus data base engine:

* Advanced Indexing Technology
* Complete Source Code

The FairCom Server: A solid, high performance database server that is
scalable, portable and offers unequalled control. FairCom has been
providing database

All these platforms supported in one package:
a better solution, with these
anywhere else!
* Complete Transaction
in Tools.h++ Professional, Threads.h++, and Standard C++ Library, your development team can
get a head start on building a solid, high-performance foundation for every application. Know

* Multithreading classes
* Solutions for Java/C++ interoperability
EDITORIAL

Worker Shortage a Tall Tale?

When it comes to the high-tech job market, the consensus
is that the U.S. is in the midst of an ongoing Information
Technology (IT) labor shortage. Fueled by industry
groups that represent large corporations looking for cheap labor, 
studies claim that the current high-tech labor shortage ranges 
anywhere from 300,000 to 600,000 workers, and will explode to 
1.2 million over the next couple of years. 

Of course, this hue and cry isn’t new. It started with a 
September 1997 U.S. Department of Commerce report that was 
followed first by the January 1998 National Information Technology 
Workforce Convocation, then by the February 1998 U.S. Senate 
Judiciary Committee hearings on H-1B visa quotas. (H-1B visas are 
temporary work permits that allow up to 115,000 foreign workers 
to be employed by specific U.S. companies, mainly in high-tech 
industries.) The resulting 
media frenzy created a cottage 
industry of sorts, with oodles 
of web sites, magazines, 
newsletters, consultants, and 
conferences. 

During this period of 
mounting hysteria, the lone 
contrary voice crying in the 
help-wanted wilderness was 
University of California at 
Davis computer-science 
professor Norman Matloff, 
whose premise is that “there
is no desperate national 
shortage of computer 
programmers.” Instead, he 
believes, the labor shortage is 
a notion contrived by large 
companies to avoid retraining midcareer (35 years and older) 
programmers, in favor of recent college graduates and foreign 
workers who command lower salaries. Although snubbed at the 
1998 National Information Technology Workforce Convocation, 
Matloff did appear before the Senate Judiciary Committee to 
provide legislators with some balance (not that they listened, 
since he neglected to distribute campaign finance checks at the 
event). Matloff’s testimony, along with scads of other relevant 
information, is available at http://heather.cs.ucdavis.edu/itaa.html. 

Matloff’s premise has gradually garnered support. Even the 
Department of Commerce, whose influential 1997 report was 
largely based on studies conducted by the industry-funded
Information Technology Association of America
(http://www.itaa.org/), has changed its tune. In a June 1999 update,
Commerce said that computer-science enrollment in U.S. 
universities is not on the decline as originally reported, but 
actually on the increase. More telling, Commerce admitted it 
can’t determine whether an IT labor shortage actually exists at 
all. And the June 1999 report acknowledges that age 
discrimination and other questionable employer hiring practices 
are a factor in the IT job marketplace. 

Adding further fuel to Matloff’s fire is a just-completed report 
by the United Engineering Foundation (http://www.uefoundation.org/),
a nonprofit corporation chartered for the
“advancement of the engineering arts and sciences.” (The UEF is 
the successor to the United Engineering Society, which was 
founded in 1904 with the support of Andrew Carnegie.) The 
UEF’s IT Workforce Data Project is a year-long study aimed at 
identifying and disseminating statistics involving U.S. IT workers.
Part I, released in January 1999, focused on Core Occupations of
the U.S. IT Workforce; Part II discussed the Production of U.S.
Degrees in IT Disciplines; Part III focused on the Foreign Origin
Persons in the U.S. IT Workforce; and most recently, Part IV
centered on Assessing the Demand for IT Workers. What makes
Part IV so interesting, and likely controversial, is its inherent
contradictions. Specifically, the UEF study concluded that, on
one hand, there does not appear to be a general national 
shortage of IT workers. On the other hand, it acknowledged that 
many employers can’t find the workers they are looking for. 
Finally, the study points out that there are many qualified (in 
terms of training and experience) IT workers who are having 
difficulty finding jobs. “How can this be?” the UEF asks. 

Well, in answering its rhetorical question, the UEF arrives at 
conclusions similar to Matloff’s. For one thing, there continues to 
be a strong preference in the job market for recent college grads 
who will work more and, of course, cost less. It’s interesting that 
the UEF report also notes the high
correlation between recent grads and
H-1B visas. This isn’t surprising, since
most high-tech H-1B workers are
recently graduated students from
U.S. universities.
But a more disturbing trend 
centers on employer obsession with 
specific skills. Companies aren’t
looking for “good programmers”;
they’re looking for Visual Basic, or Java,
or ColdFusion, or whatever programmers.
Consequently, many students, and the 
schools they attend, are focusing on 
developing narrowly defined skills, rather 
than the broader fundamentals required 
to be a “good” programmer. The long-term
negative consequences of this
approach are obvious: we’ll end up
with legions of programmers experienced in tools, technologies,
and applications that are out of date and out of demand. Even 
worse, those programmers will not be prepared to adapt to new 
skill sets because of the narrow nature of their initial training. And 
naturally, we'll have a managerial mentality predisposed against 
retraining because of short-term costs. Of course, we have that 
situation now, and it just isn’t working, and we can’t count on
another Y2K-like scenario coming along to resurrect stagnant 
programming careers. 

So is there an IT labor shortage? It depends on whom you
believe, or whom you want to believe. You can’t deny the heaping
help-wanted ads packed with programmer positions, or that 
career recruitment web sites are doing well enough to sponsor 
NFL football games. Nor can you ignore reports of programmers 
who can’t find jobs because they are 40 years old and have MS 
degrees and 20 years of experience. It comes down to whether 
you go with the credibility of an organization like the UEF, 
which has no apparent vested interest, versus that of 
organizations such as the ITAA, which is funded by and for the 
benefit of profit-crazed corporations. You make the call. 


Jonathan Erickson 


editor-in-chief 
jerickson@ddj.com 


http://www.ddj.com 
Data Structures as Objects
Dear DDJ,
I really enjoyed the article “Data Structures
as Objects,” by Jiri Soukup (DDJ,
October 1999). However, there was one
thing that I didn’t quite understand: the
semantics of Bags and Sets. As I understood
from the article, an object can be
contained in any number of Bags, but it
can only be in one Set at a time. Intuitively,
I would think the semantics would
be the exact opposite of this. In mathematics,
an object can be an element of more
than one set; for instance, 1 is in both
the set of Natural numbers and the set of
Real numbers. However, I cannot put my DDJ
in two physical bags at the same time. If
the semantics are really the reverse of
real-life bag and set semantics, I would
like to know why.

Bart Samwel 

bsamwel@ingr.com 


Jiri responds: Bart, this may have histor- 
ical reasons, but the terms used in the 
data structure field are: set=collection of 
items, where each item can be at most 
once in the set; and bag=collection, 
which can have duplication of points, 
and each point can be in several bags. 
I agree with you that this is counterin- 
tuitive. It also shows the kind of mental 
confusion that still exists about data 
structures. STL defines only Set, and in- 
stead of Bag, STL uses the term “collec- 
tion.” I used the terms in my article as 
references to something most program- 
mers would understand. For the reason 
you mention, I never use them myself — 
and I work with three types of collec- 
tions: direct collection (embedded point- 
ers, set); aggregate=direct collection, 
where each item knows its parent (hier- 
archy); and indirect collection (using in- 
termediary links, bag). 
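
Jiri's distinction between a direct collection (embedded pointers) and an indirect collection (intermediary links) can be sketched in a few lines. The Python below is only an illustration under stated assumptions, not code from the article: the class names `Item`, `DirectCollection`, `Link`, and `IndirectCollection` are hypothetical, as is the `owner` field used to detect a second insertion.

```python
class Item:
    """An item that carries its own link field (hypothetical illustration)."""

    def __init__(self, name):
        self.name = name
        self.next = None    # embedded pointer, used only by DirectCollection
        self.owner = None   # the direct collection currently holding this item


class DirectCollection:
    """Linked list threaded through the items themselves ("set" in the letter)."""

    def __init__(self):
        self.head = None

    def add(self, item):
        # The single embedded pointer means an item can be linked only once,
        # so it can sit in at most one direct collection, at most one time.
        if item.owner is not None:
            raise ValueError(f"{item.name} is already in a direct collection")
        item.next = self.head
        item.owner = self
        self.head = item

    def __iter__(self):
        node = self.head
        while node is not None:
            yield node
            node = node.next


class Link:
    """Intermediary node that points at an item without modifying it."""

    def __init__(self, item, next_link):
        self.item = item
        self.next = next_link


class IndirectCollection:
    """Linked list of Link nodes ("bag" in the letter)."""

    def __init__(self):
        self.head = None

    def add(self, item):
        # Each insertion allocates a fresh Link, so duplicates are allowed
        # and the same item may appear in any number of indirect collections.
        self.head = Link(item, self.head)

    def __iter__(self):
        link = self.head
        while link is not None:
            yield link.item
            link = link.next
```

In this sketch, adding the same `Item` to a second `DirectCollection` raises `ValueError`, while adding it repeatedly to one or more `IndirectCollection` objects succeeds. That matches the behavior Bart found counterintuitive: the restriction comes from where the link is stored, not from the mathematical meaning of "set."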


Real (Netscape) Time 

Dear DDJ, 

After reading Eugene Kim’s review of
Netscape Time: The Making of a Billion-Dollar
Start-Up That Took on Microsoft,
by Jim Clark (see “Programmer’s Bookshelf,”
DDJ, December 1999), I disagree
that Clark invented the term
“Netscape Time.” I first ran across the
term in the November 1995 issue of Fast
Company magazine, which featured an
article entitled “Can You Work in Netscape
Time” (see
http://www.fastcompany.com/online/01/netscape.html).

Glenn Crist 

gcrist@airmail.net 


Riding the XML Bandwagon
Dear DDJ, 
Regarding “Microsoft Jumps on the XML 
Bandwagon” (“News & Views,” DDJ, De- 
cember 1999): Microsoft has hardly just 
“discovered” XML. While the company 
appears to have a sudden recent inter- 
est in XML with BizTalk and Windows 
2000, it was probably more ahead of the 
game than others. Recalling from mem- 
ory, Microsoft sat on the working group 
to standardize XML, it had an XML pro- 
cessor in the winter of 1998, and an early 
XSL processor prototype released in 
February or March of 1998. Microsoft 
even licensed this XSL processor for use 
by companies such as ArborText. Its IE5
browser has built-in rendering support 
for XML and, I think, the October 1998 
version of XSL. In short, Microsoft isn’t 
late to the XML game; it’s just late to the 
Windows 2000 game. 

Evan Easton 

evan@eeaston.com 


Porting to CE 

Dear DDJ, 

Oliver Diener’s article “Porting Communications
Software to Windows CE” (DDJ,
July 1999) was very interesting, but I had
to comment on the first sentence: “If you’re
to believe Microsoft, porting a Win32 application
to Windows CE is a piece of
cake, because almost all of the well-known
Win32 APIs are there.” I have to
add this to my collection of trite comments
about Microsoft’s marketing, such
as: “If you believe Microsoft, Plug and Play
is perfect...;” “If you believe Microsoft, NT
is a secure operating system...;” “If you believe
Microsoft, VB is the ideal development
language for all tasks...” Let’s be realistic.
DDJ’s target audience is a bit more
sophisticated than average, and most of
us have been in the business long enough
to distinguish the truth from the [hype].
We all know Microsoft’s marketing exaggerates,
often to the point of outright falsehood.
But let’s face it, so does every major
firm: If you believe Apple, anyone can
use a computer effectively with no training
at all. If you believe Sun, Java apps
are always portable, with no additional
effort on the part of the programmer. (And
if you believe that, I’ve got some oceanfront
property in Arizona...)

The fact is, communications software 
under Win32 is a giant pain for most peo- 
ple, and expecting a painless port to 
WinCE is naive at best. I think we might 
be better off to teach the less-experienced 
coders to distinguish fact from fiction 
rather than crying Pinocchio all the time. 

The real problem isn’t that Microsoft 
marketing exaggerates— every firm does. 
The problem is we aren’t teaching the 
wide-eyed college graduates to look for 
the truth instead of a silver bullet. That’s 
one of the reasons why our industry has 
such an abysmal success record. 

Keep up the good work. Just bear in
mind that DDJ readers are generally more
knowledgeable than readers of many other
developer publications.

Ron Ruble 

raffles1@worldnet.att.net


Nothing New About Open Source
Dear DDJ,
I had to chuckle when I read Al Stevens’s 
April 1999 “C Programming” column 
where he said, “I really like this kind of 
beta testing; the testers not only find the 
problems, but they fix them too.” Did he 
realize at the time that he was espousing 
the open source litany? As Eric Raymond 
puts it, “Given enough eyeballs, all bugs 
are shallow.” Was it unintentional or is 
Al Stevens subtly promoting open soft- 
ware? Hmm. 

David A. Rogers 

darogers@xnet.com 


Al replies: David, many thanks for 
your whimsical reaction. I’ve always 
been in favor of, and have practiced, 
the free distribution of software source 
code and the free exchange of ideas 
among programmers. Virtually all the 
software I’ve published in the past two 
decades has included source code. 
Whether that fits some trendy institu- 
tional definition of “open software” or 
aligns with the social agenda of some 
organized group of programmers is of 
no concern to me. 


Y2K Worries? 
Dear DDJ, 
I have to agree with Jonathan Erickson 
[when he states] in his June 1999 editori- 
al that the perception of Y2K will be more 
dangerous than the reality. Alarmists ev- 
erywhere are having the time of their lives. 
Good for them. 

As for Jonathan’s concern about his
bank: You don’t want to go to the gas station
to see if your debit or credit card still
works? The grocery store? (Oops, I forgot,
the shelves will be empty.) I would
go for just that purpose so I could get in 
on the class-action lawsuit...(was that a 
different article?) Either way, January 1, 
2000, is still a valid business day so 
Jonathan should cut his financial institu- 
tion some slack! 

I'll step off my soapbox for now. Be- 
sides, I have to talk to the backhoe oper- 
ator who is installing my underground 
gasoline storage tank. He’s holding up my 
generator delivery... 

Bruce MacDonald 

macdonb@scottsdaleins.com 


Version Control 

Dear DDJ,

I'd like to compliment Aspi Havewala on
his article, “The Version Control Process”
(DDJ, May 1999). I’ve seen too many cases
where the “common sense” of version
control is either not used or is misapplied.
His discussion was clear and concise,
yet dealt with all the important issues
in the process. It is an excellent
starting point for most development projects.
I thought the discussion of the ongoing
roles of “version control administrator”
and “build captain” was especially
useful, as managers often assume developers
will just “take care of it” without
allocating time for administration, which
can result in chaos and render the
database useless.


One area I thought could have been
improved was the discussion of adding
object and executable files to the database.
While I agree that object and executable
files should not be added, it goes beyond
the additional burden placed on the developer
and additional stress placed on
the database. Adding object and executable
files to the database is useless and
has the potential to destroy the integrity
of your database.

Object and executable files are the
product of more than the source files;
compilation options, shell environments,
and tool configuration all play a critical
role in deciding what object or executable
is produced in a build. If you
have managed to capture all of this information,
then recreation of the object
and executables should be trivial at a
particular checkpoint and it is useless
to track the intermediate files. If you have
not captured this information in the
version-control system, then you have object
and executable files that have no
understandable relationship to the source
code, which means that you have no real
understanding of what is in your
database, therefore destroying the integrity
of your database. Programmers
don’t object to extra work, they object
to pointless extra work!

As an example, in a UNIX environ- 
ment, many programmers rely on the 
ability to tinker with the build environ- 
ment without changing the makefiles or 
src files; for instance, optimization op- 
tions can often easily be changed via the 
shell environment or command line. 
Imagine that a developer changes the
optimization setting in this way, perhaps
setting the optimization lower so he can
use a debugger. If his changes to object 
files are checked into version control, 
another developer could spend hours try- 
ing to figure out why his newly linked 
application (built, in part, with object 
files that were checked into version con- 
trol with lower optimization) is so slow! 

Scott Venckus 

Applied Research Laboratories 

University of Texas, Austin 

svenckus@arlut.utexas.edu 
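
Scott's point, that an object file is a product of the sources plus the compiler options and the shell environment, can be sketched as a fingerprint computed over all of those inputs. This is only an illustration, not something described in the letter or provided by any particular version-control system; `build_fingerprint` and its parameters are hypothetical names.

```python
import hashlib


def build_fingerprint(sources, flags, env):
    """Hash every input that influences a build (hypothetical helper).

    sources: dict mapping file name -> file contents (bytes)
    flags:   list of compiler options, in command-line order
    env:     dict of the environment variables the build reads
    """
    digest = hashlib.sha256()
    # Sort file names and variable names so the result does not depend
    # on dictionary ordering; flag order is kept because it can matter.
    for name in sorted(sources):
        digest.update(name.encode() + b"\0" + sources[name] + b"\0")
    for flag in flags:
        digest.update(flag.encode() + b"\0")
    for key in sorted(env):
        digest.update(f"{key}={env[key]}".encode() + b"\0")
    return digest.hexdigest()


sources = {"main.c": b"int main(void) { return 0; }\n"}
optimized = build_fingerprint(sources, ["-O2"], {"CC": "cc"})
debug = build_fingerprint(sources, ["-O0", "-g"], {"CC": "cc"})
print(optimized == debug)  # False: same sources, different build
```

Two builds of identical source with different optimization flags get different fingerprints, which is exactly the information lost when only the resulting object files are checked in.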
HoTMetaL PRO 6.0
by SoftQuad Software, Inc. 


HoTMetaL PRO 6.0 is the ultimate Web 
development environment. Powerful, 
customizable and extensible, it has all the 
ERP Can Spell Disaster

Enterprise Resource Planning (ERP) software
has become the trend du jour in the
business world. Tired of disconnected,
outdated, and often incompatible applications
performing various business functions,
organizations have been looking for
“integrated solutions,” which do everything
from inventory to distribution to human
resources and accounts payable. Big
players in the ERP software and services
market include the likes of SAP AG, Baan,
Oracle, and PeopleSoft, among others.

But many companies are finding out 
the hard way that massive and expen- 
sive application integration is easier said 
than done. In the past few months, several
large, visible companies have run
into serious problems with their ERP
software. Hershey Foods’ $112 million
R/3 software from SAP AG fouled up the
company’s candy shipments for Halloween.
Whirlpool’s distribution system for
its appliances suffered similar problems 
resulting from an SAP R/3 installation. 
Allied Waste Industries cancelled a $130 
million SAP R/3 project, while Waste 
Management did the same after spend- 
ing $45 million of an expected $250 mil- 
lion on a similar project. 

Similar problems struck PeopleSoft, 
which has been sued by W.L. Gore (mak- 
ers of Gore-Tex) for an allegedly botched 
ERP project. And Procter & Gamble’s in- 
ternally developed SourceOne global ERP 
system has been spitting out incorrect data 
and suffering from slow data retrieval 
times. There’s no moral to the story. But 
caveat emptor when the big suits come 
knocking with “integrated solutions.” 


Linux Goes Super 

Linux is hitting the big time, as two su- 
percomputer projects based on the freely 
available OS have been launched. The U.S. 
Department of Energy’s Argonne Nation- 
al Laboratory recently joined forces with 
IBM and VA Linux Systems to build a 512- 
CPU Linux cluster. Called the “Chiba City 
Project” (after the futuristic Chiba City in 
William Gibson’s scifi novel, Neuro- 
mancer), the system will be Argonne’s 
most powerful supercomputer. The clus- 
ter is composed of 256 dual processor 
servers from VA Linux and IBM. A prepared
statement from Argonne said that
the “cluster installation was accomplished
in a two-day ‘barn raising’ event, complete
with banjo player. Over fifty Argonne
scientists pitched in to help build the cluster,
which links high-performance servers
from VA Linux with advanced hardware
from IBM and the latest in network
interconnect hardware.”

Meanwhile, Silicon Graphics (SGI)
demonstrated a Linux-based supercomputer
at the Supercomputing Conference
99 in Portland, Oregon. SGI is trying to
sell off its Cray supercomputer division,
opting instead to offer new supercomputers
based on clustered 64-bit Intel IA-64
Itanium processors running Linux.


Dot-com Companies Bet the Farm


The wild speculation on Internet (or dot-com)
company stock shares has apparently
subsided, but it seems that an advertising
spending spree is just getting into
gear. Internet companies spent close to $2
billion in advertising in 1999. Dot-com
advertisers have been a boon for radio and
TV stations, not to mention billboard owners.
Amazon.com alone spent $100 million
in advertising for the 1999 holiday
season.

Amazingly, 20 percent of the available
commercials for the January 2000 Super
Bowl were purchased by dot-com companies.
A 30-second commercial costs
about $2 million. Most of these dot-com
advertisers have yet to turn a profit. One
Super Bowl advertiser, Computer.com,
spent $3 million on Super Bowl advertising
and has generated less than $500,000
in revenue, according to an article in The
Wall Street Journal. Anthony Perkins, editor
of Red Herring and coauthor of The
Internet Bubble, advises investors to sell
their Internet stocks and invest in broadcast
companies.

Ironically, some advertising agencies are
demanding dot-com companies fork over
50 percent of the fee up front.


The Tiniest Transistor 


Engineers at the University of California,
Berkeley, claim to have created the world’s
tiniest semiconductor transistor. Called the
FinFET, the transistor uses a new gate design
with improved current control and
reduced current leakage, which allows the
transistor to be much smaller. Unlike the
conventional flat conductor gate that controls
only one side of the passage through
which the current flows in a transistor, the
UC Berkeley design uses a fork-shaped
prong that straddles both sides of the
current channel.

UC Berkeley Professor Chenming Hu,
the lead faculty member on the project,
told DDJ that the major obstacle to
shrinking the size of transistors is the
gate length. If the gate length is too
small, current leaks across the gate even
when the transistor is in the off state.
Conventional transistors have a gate
length of about 180 nanometers. Professor
Hu’s team has successfully reduced
the length to 18 nanometers using the
prong design, which has solved the current
leakage problem. With a tenfold
reduction in gate length, silicon chips
using these new transistors would have
100 times the capacity of chips using
conventional transistors. Hu says that
computer simulation tests indicate that the
gate length can eventually be reduced to
9 nanometers, which would result in a
400-fold increase in chip capacity.
Professor Hu said that there are no 
plans to patent the device and hopes that 
the research will provide a “stimulant to 
industry for trying new approaches to 


_ semiconductor design.” Professor Hu and 


graduate student Xuejue Huang present- 
ed their research in detail at the Interna- 
tional Electron Devices Meeting (http:// 
www.ieee.org/conference/iedm). 


And Speaking of Tiny... 

Yale and Rice University researchers are turning to molecular computing
to come up with an even smaller logic gate. The researchers have
created an electronic switch the size of a single molecule. Electrical
engineers from Yale and chemists from Rice joined forces to develop the
molecular switch, which is described in the November 19, 1999, issue of
Science magazine. Professor James Tour of Rice designed the synthetic
molecule used in the switch, while Professor Mark Reed of Yale led the
effort involving the electronics.

According to a prepared statement from both universities, the switch
exhibits on/off ratios of close to 1000:1 compared to 50:1 for typical
silicon devices. The designers expect fabrication of molecular switches
to be much cheaper than semiconductors once the technology evolves into
practical applications, which is estimated to be 5 to 10 years away.
The research was also presented at the International Electron Devices
Meeting (http://www.ieee.org/conference/iedm).







Philip Wadler 


The Java programming language may be about to change. While the Java
system has added and revised libraries at a furious pace, the Java
language proper has remained frozen since the introduction of inner
classes a few years ago. Now, Sun has invoked the Java Community
Process to consider adding generics to the Java language.

Here is the problem that generics solve. Say you wish to pro- 
cess lists. Some may be lists of bytes, others lists of strings, and 
yet others lists of lists of strings. In Java, this is simple. You can 
represent all three by the same class, which has elements of class 
Object, see Table 1(a). 

To keep the language simple, you are forced to do some of 
the work yourself: You must keep track of the fact that you have 
a list of bytes (or strings or lists), and when you extract an ele- 
ment from the list, you must cast it from class Object back to 
class Byte (or String or List) before further processing. For in- 
stance, the collection class framework in Java 2 treats collections 
(including lists) in just this way. 

As Einstein said, everything should be as simple as possible, 
but no simpler. And some might say the above is too simple. If 
we extend the Java language with generic types, then it is pos- 
sible to represent information about lists in a more direct way; 
see Table 1(b). The compiler could now keep track of whether 
you have a list of bytes (or strings or lists), and no explicit cast 
back to class Byte (or String or List<String>) would be required. 
In some ways, this is similar to generics in Ada or templates in
C++. It is also similar to parametric polymorphism in so-called
functional languages such as ML and Haskell. 

Sun’s work is influenced by a number of previous propos- 
als, notably Generic Java (GJ). GJ was designed by Gilad 
Bracha and David Stoutamire at Sun, Martin Odersky at the 
Swiss Federal Institute of Technology, and myself. Bracha is 
now spearheading Sun’s process for adding generics to Java, 
with Odersky and myself sitting on the associated experts 
committee. In this article, I'll describe GJ as a way of looking 
ahead at what may emerge from Sun. Further GJ details are
available at http://www.cs.bell-labs.com/~wadler/gj/. The GJ 
compiler, written by Odersky, is available for download from 
the site. Odersky’s GJ compiler is also the basis of Sun’s cur- 
rent Java compiler, although that compiler does not support 
generics (yet). GJ and the GJ compiler are not themselves Sun 
products. 

Some of GJ’s key features include: 


• Compatibility with Java. GJ is a superset of Java. Every Java
program is still legal and retains the same meaning in GJ.

• Compatibility with the Java Virtual Machine (JVM). GJ compiles into
JVM code. No change to the JVM is required. Thus, GJ runs everywhere
Java runs, including in your browser.

• Compatibility with existing libraries. Existing libraries work with
GJ, even in compiled class binary form. Sometimes it is possible to
retrofit an old library to have new types, without access to the
source. For instance, the Java collection class library was retrofitted
to add generics.

• Efficiency. Information about generic types is maintained only at
compile time, not at run time. This means that compiled GJ code is
pretty much identical to Java code for the same purpose, and equally
efficient.


Philip is a researcher at Bell Labs, Lucent Technologies. He can
be reached at wadler@research.bell-labs.com.


The GJ compiler works by translating GJ back to ordinary Java. 
The translation simply erases type parameters and adds casts. For 
instance, it takes the GJ class List<Byte> back into the Java class 
List, and adds casts from Object to Byte where needed. The result 
is much the same Java you would have written if generics weren’t 
available. This is why it is simple to interface GJ with existing Java 
libraries, and why GJ is as efficient as Java. Furthermore, GJ comes 
with a “cast-iron” guarantee: No cast inserted by the compiler will 
ever fail. In effect, the type system is a simple logic that lets the 
compiler prove that the cast is safe. On top of this, since GJ trans- 
lates into JVM byte codes, all of the usual safety and security prop- 
erties of the Java platform are preserved. 
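
To make the translation concrete, here is a minimal sketch written in the generics syntax that later Java releases adopted, which is assumed here to behave like GJ for this simple case. The class and method names (ErasureSketch, firstGeneric, firstErased) are hypothetical. The second method is a hand-written approximation of what erasure produces: a raw List plus an explicit cast.

```java
import java.util.ArrayList;
import java.util.List;

class ErasureSketch {
    // Generic source: the element type is part of the list's type,
    // so no cast appears in the source.
    static Byte firstGeneric(List<Byte> xs) {
        return xs.iterator().next();
    }

    // Hand-written approximation of the erased form:
    // a raw List plus an explicit cast from Object to Byte.
    @SuppressWarnings("rawtypes")
    static Byte firstErased(List xs) {
        return (Byte) xs.iterator().next();
    }

    public static void main(String[] args) {
        List<Byte> xs = new ArrayList<>();
        xs.add((byte) 0);
        xs.add((byte) 1);
        System.out.println(firstGeneric(xs));
        System.out.println(firstErased(xs));
        // Type parameters are erased, so lists of different element
        // types share a single run-time class.
        System.out.println(xs.getClass() == new ArrayList<String>().getClass());
    }
}
```

Both methods do the same work at run time; the difference is only in who writes, and who checks, the cast.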


How GJ Works 

Let’s look at two examples of how GJ works. The first covers 
the most basic features of GJ, as used in building and examin- 
ing generic lists. The second covers some more advanced fea- 
tures and shows how to write a generic method to find the largest 
element in a list. In both cases, I'll first consider the Java code
for a task, then show how it is rewritten in GJ. I'll also describe 
how to retrofit legacy Java libraries to GJ and consider how C++ 
templates compare with GJ generics. 

Example 1 shows list and iterator interfaces (both highly sim- 
plified from the Java collections library), a linked list class that im- 
plements the list interface, and a test class. The list interface pro- 
vides a method to add an element to a list, add, and a method to 
return an iterator for the list, iterator. In turn, the iterator interface 
provides a method to determine if the iteration is done, hasNext, 
and (if it is not) a method to return the next element and advance 
the iterator, next. The linked list class implements the list interface, 
but I'll skip the details of how this is done. The add method ex- 
pects an object, and the next method returns an object. Every class 
is a subclass of Object, so you can form lists with elements of type 
Byte, String, List itself, or any other class. 

The test class builds some lists, then extracts elements from them. 
Users must recall what type of element is stored in a list, and cast 
to the appropriate type when extracting an element from a list.
Extracting an element from a list of lists requires two casts. If the
user accidentally attempts to extract a byte from a list of strings, it
raises an exception at run time.

Example 2 shows lists, iterators, linked lists, and the test class 
rewritten in GJ. The interfaces and class take a type parameter A, 
written in angle brackets, representing the element type. Each 
place where Object appeared in the previous code is now replaced by A.
Each place where List, Iterator, or LinkedList appeared in the previous
code is now replaced by List<A>, Iterator<A>, or LinkedList<A>,
respectively. Instead of relying on the user's memory, parameters
document the type of each list’s elements, and no casts are required.
The code to extract an element from a list of lists is now more
perspicuous. An attempt to extract a byte from a list of strings will
now indicate an error at compile time.

Table 1: Processing lists. (a) Java; (b) with generic types.

In Java, lists are heterogeneous: they may have elements of any type,
and there is no way to enforce that they all have the same type. In GJ,
lists are homogeneous: they must have all elements of the same type,
and the compiler enforces this. If you really need a list with elements
of any type, you use List<Object>.
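
As a small illustration of the List<Object> escape hatch, the following sketch uses the java.util collections and the generics syntax of later Java releases (an assumption; the class name MixedListSketch is hypothetical). A List<Object> accepts anything, while a List<String> rejects a non-string at compile time.

```java
import java.util.ArrayList;
import java.util.List;

class MixedListSketch {
    // A deliberately heterogeneous list: any object is accepted.
    static List<Object> mixed() {
        List<Object> xs = new ArrayList<>();
        xs.add(Byte.valueOf((byte) 0));
        xs.add("one");
        xs.add(new ArrayList<String>());
        return xs;
    }

    public static void main(String[] args) {
        List<String> ys = new ArrayList<>();
        ys.add("zero");
        // ys.add(Byte.valueOf((byte) 1));  // rejected at compile time: not a String
        for (Object x : mixed()) {
            System.out.println(x.getClass().getSimpleName());
        }
    }
}
```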

To translate from GJ into the Java programming language, you 
replace each type by its erasure. The erasure of a parametric 
type is obtained by deleting the parameter (so List<A> erases to 
List), the erasure of a nonparametric type is the type itself (so 


interface List {
    public void add(Object x);
    public Iterator iterator();
}
interface Iterator {
    public Object next();
    public boolean hasNext();
}
class LinkedList implements List { ... }
class Test {
    public static void main(String[] args) {

        // byte list
        List xs = new LinkedList();
        xs.add(new Byte((byte) 0)); xs.add(new Byte((byte) 1));
        Byte x = (Byte) xs.iterator().next();

        // string list
        List ys = new LinkedList();
        ys.add("zero"); ys.add("one");
        String y = (String) ys.iterator().next();

        // string list list
        List zss = new LinkedList();
        zss.add(ys);
        String z =
            (String) ((List) zss.iterator().next()).iterator().next();

        // string list treated as byte list
        Byte w = (Byte) ys.iterator().next(); // run-time exception
    }
}

Example 1: List and iterator interfaces.


interface List<A> {
    public void add(A x);
    public Iterator<A> iterator();
}
interface Iterator<A> {
    public A next();
    public boolean hasNext();
}
class LinkedList<A> implements List<A> { ... }
class Test {
    public static void main(String[] args) {

        // byte list
        List<Byte> xs = new LinkedList<Byte>();
        xs.add(new Byte((byte) 0)); xs.add(new Byte((byte) 1));
        Byte x = xs.iterator().next();

        // string list
        List<String> ys = new LinkedList<String>();
        ys.add("zero"); ys.add("one");
        String y = ys.iterator().next();

        // string list list
        List<List<String>> zss = new LinkedList<List<String>>();
        zss.add(ys);
        String z = zss.iterator().next().iterator().next();

        // string list treated as byte list
        Byte w = ys.iterator().next(); // compile-time error
    }
}

Example 2: Lists, iterators, linked lists, and the test class
rewritten in GJ.


Example 3: Declarations for the Comparable interface
and the Byte class that implements this interface.


Example 4: Code rewritten in GJ.

Byte erases to Byte), and the erasure of a type parameter is Object
(so A erases to Object). A suitable cast is inserted around each method
call where the return type is a type parameter. Translating the GJ code
for lists yields the Java code you started with (except for the line
that was in error). Thus, a new program compiled against the GJ code
could be used with an old library compiled against the Java code.

Angle brackets were chosen for type parameters because they 
are familiar to C++ users, and because the other forms of brack- 
ets may lead to confusion. If round brackets are used, it is dif- 
ficult to distinguish type and value parameters. If square brack- 
ets are used, it is difficult to distinguish type parameters and 
array dimensions. If curly brackets are used, it is difficult to dis- 
tinguish type parameters from class bodies. 

Phrases like LinkedList<LinkedList<String>> pose a problem 
to the parser, since >> is treated as a single lexeme. (Similarly 
for >>>.) In C++, users are required to add extra spaces to avoid 
this problem. In GJ, users need not worry; the problem is instead
solved by a slight complication to the grammar.


Bridges, Generic Methods, and Bounds 

The next example demonstrates more advanced features of 
generics, including bridges, generic methods, and bounds. I'll 
consider Java code to find the maximum element in a list, then 
show how this is rewritten in GJ. 

In Java, objects that can be compared should be declared as 
implementing the Comparable interface. Each base type (such 
as byte or boolean) has a corresponding object type (such as 
Byte or Boolean). 

Example 3 shows declarations for the Comparable interface and 
the Byte class that implements it (both simplified from the Java li- 
brary). The method compareTo takes a parameter object and re- 
turns an integer that is negative, zero, or positive if the receiver ob- 
ject is less than, equal to, or greater than the parameter object. The 
Byte class defines two methods, one with signature compareTo(Byte)
(which overloads the name compareTo) and one with signature
compareTo(Object) (which overrides the corresponding method in the
Comparable interface). The first method simply subtracts the two bytes
to determine how they compare. The second method casts the object to a
byte and then calls the first method; if it is passed an object other
than a byte, a run-time error will occur. The second method is required
because overriding occurs only when types match exactly. It is called a
“bridge” because it connects the first method (which is specific to
bytes) to the interface method (which is general for all objects).
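
The hand-written bridge pattern just described can be sketched as follows. This is an illustration under stated assumptions, not the library source: the names OldComparable and OldByte are hypothetical, chosen so the sketch does not clash with java.lang.Comparable and java.lang.Byte.

```java
// Pre-generics style interface: the argument is a plain Object.
interface OldComparable {
    int compareTo(Object that);
}

final class OldByte implements OldComparable {
    private final byte value;

    OldByte(byte value) { this.value = value; }

    // Overload specific to bytes: subtract to compare.
    public int compareTo(OldByte that) {
        return this.value - that.value;
    }

    // Hand-written bridge: overrides the interface method, casts, and forwards.
    // Throws ClassCastException if the argument is not an OldByte.
    public int compareTo(Object that) {
        return this.compareTo((OldByte) that);
    }
}

class BridgeByHandDemo {
    public static void main(String[] args) {
        OldByte a = new OldByte((byte) 1);
        OldByte b = new OldByte((byte) 3);
        System.out.println(a.compareTo(b) < 0);   // resolved to compareTo(OldByte)

        OldComparable c = a;
        System.out.println(c.compareTo(b) < 0);   // goes through the bridge

        try {
            c.compareTo("not a byte");            // wrong type reaches the cast
        } catch (ClassCastException e) {
            System.out.println("run-time error: " + e.getClass().getSimpleName());
        }
    }
}
```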

Example 3 also shows a utility class, Lists, with a static method 
to find the maximum element in a list, and a test class. Method 
max takes a nonempty list of comparable elements and returns 
the maximum element in the list. The test class shows two sample 
calls to the method. As before, users must keep track of the result 
type and cast the results as appropriate. Booleans do not imple- 
ment the comparable interface, so an attempt to find the maxi- 
mum of a collection of Booleans raises an exception at run time. 
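
A minimal sketch of such a pre-generics max method follows, using raw java.util types. The class name RawLists is hypothetical. One deliberate departure from the text: java.lang.Boolean has implemented Comparable since Java 5, so the sketch uses plain Object elements to provoke the run-time failure instead.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

class RawLists {
    // Returns the largest element of a nonempty list.
    // Nothing in the signature says the elements must be comparable,
    // so a wrong element type is only discovered by the cast at run time.
    @SuppressWarnings({"rawtypes", "unchecked"})
    static Comparable max(List xs) {
        Iterator it = xs.iterator();
        Comparable best = (Comparable) it.next();
        while (it.hasNext()) {
            Comparable x = (Comparable) it.next();
            if (best.compareTo(x) < 0) best = x;
        }
        return best;
    }

    @SuppressWarnings("rawtypes")
    public static void main(String[] args) {
        List bytes = Arrays.asList(Byte.valueOf((byte) 0), Byte.valueOf((byte) 1));
        Byte x = (Byte) max(bytes);               // caller must remember the type
        System.out.println(x);

        List objects = Arrays.asList(new Object(), new Object());
        try {
            max(objects);                         // elements are not Comparable
        } catch (ClassCastException e) {
            System.out.println("run-time error: " + e.getClass().getSimpleName());
        }
    }
}
```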

Example 4 is the code rewritten in GJ. The Comparable in- 
terface now takes a type parameter A, indicating the type com- 
pared to. For instance, the class Byte implements the interface 
Comparable<Byte>, indicating that a byte can be compared with 
another byte. The interface requires a method with signature 
compareTo(A), so you would expect this to be implemented in 
the class by a method with signature compareTo(Byte). Users 
do not need to write the bridge method, since it is automati- 
cally generated by the GJ compiler. 
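
The generated bridge can be observed through reflection in later Java releases, whose compilers are assumed here to follow the same scheme. In this sketch the class Weight and the helper countBridges are hypothetical; Method.isBridge() is the standard reflection call that flags compiler-generated bridge methods.

```java
import java.lang.reflect.Method;
import java.util.Arrays;

// Only compareTo(Weight) is written by hand.
final class Weight implements Comparable<Weight> {
    final int grams;

    Weight(int grams) { this.grams = grams; }

    public int compareTo(Weight that) {
        return Integer.compare(this.grams, that.grams);
    }
}

class BridgeReflectionDemo {
    // Counts the compiler-generated compareTo bridges declared by a class.
    static int countBridges(Class<?> c) {
        int n = 0;
        for (Method m : c.getDeclaredMethods()) {
            if (m.getName().equals("compareTo") && m.isBridge()) n++;
        }
        return n;
    }

    public static void main(String[] args) {
        // Lists each compareTo variant, its parameter types, and whether it is a bridge.
        for (Method m : Weight.class.getDeclaredMethods()) {
            System.out.println(m.getName()
                + Arrays.toString(m.getParameterTypes())
                + " bridge=" + m.isBridge());
        }
        System.out.println(countBridges(Weight.class));
    }
}
```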

The signature of the method max illustrates two interest- 
ing features of GJ— generic methods and bounds. Here is the 
signature: 


public static <A implements Comparable<A>> A max (List<A> xs)






The signature says the method takes a list with elements of type 
A and returns an element of type A. The phrase in angle brack- 
ets at the beginning declares the type parameter A and indicates 
that the method can be instantiated for any type A that imple- 
ments the interface Comparable<A>. A method with its own type 
parameter is called a “generic method,” and a type parameter 
that must implement a given interface (or be a subclass of a giv- 
en class) is called “bounded.” 

The test class shows two sample calls to the method. In the first 
call, the compiler automatically infers that the type parameter A 
in the method signature must be instantiated to Byte in the 
method call, and it checks that class Byte implements the bound 
Comparable<Byte>. In the second call, the inferred type pa- 
rameter is Boolean, and it does not implement the bound Com- 
parable<Boolean>. So, where Java raises an exception at run 
time, GJ indicates an error at compile time. 
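
A runnable sketch of the generic max method follows. It assumes the syntax of later Java releases, where a bound is written with "extends" even for an interface, where GJ writes "implements". The class name GenericLists is hypothetical, and the rejected call is left as a comment because it would not compile.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

class GenericLists {
    // Generic method with a bounded type parameter:
    // A may be any type that can be compared with itself.
    static <A extends Comparable<A>> A max(List<A> xs) {
        Iterator<A> it = xs.iterator();
        A best = it.next();
        while (it.hasNext()) {
            A x = it.next();
            if (best.compareTo(x) < 0) best = x;
        }
        return best;
    }

    public static void main(String[] args) {
        List<Byte> xs = Arrays.asList((byte) 0, (byte) 1);
        Byte x = max(xs);                 // A is inferred as Byte; no cast needed
        System.out.println(x);

        // List<Object> os = Arrays.asList(new Object(), new Object());
        // max(os);                       // rejected at compile time:
        //                                // Object does not implement Comparable<Object>
    }
}
```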

In general, a bound is introduced by following the parameter 
with “implements” and an interface, or “extends” and a class. 
Bounds are allowed wherever type parameters are introduced 
in either the head of a class or the signature of a generic method. 
The bounding interface or class may itself be parameterized, 
and may even be recursive, as in the example where the bound 
Comparable<A> contains the bounded type parameter A. 
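
Bounds in the head of a class work the same way as on a generic method. The following sketch, again in the "extends" syntax of later Java releases and with the hypothetical names MaxTracker and MaxTrackerDemo, shows a recursive bound on a class's type parameter.

```java
// The bound appears in the class head and mentions the parameter it bounds.
class MaxTracker<A extends Comparable<A>> {
    private A best;                       // null until the first offer

    // Remembers x if it is larger than everything seen so far.
    void offer(A x) {
        if (best == null || best.compareTo(x) < 0) best = x;
    }

    A current() {
        return best;
    }
}

class MaxTrackerDemo {
    public static void main(String[] args) {
        MaxTracker<String> t = new MaxTracker<>();
        t.offer("pear");
        t.offer("zebra");
        t.offer("apple");
        System.out.println(t.current());

        // MaxTracker<Object> bad;        // rejected at compile time:
        //                                // Object is outside the bound
    }
}
```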

The definition of erasure is extended so that the erasure of a 
type variable is the erasure of its bound (so A erases to Com- 
parable in max). As before, the translation introduces suitable 
casts, and it also introduces bridge methods where required. 
And as before, translating the GJ code yields the Java code you 
started with (except for the line that was in error). 
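
The effect of a bound on erasure can be checked with reflection in later Java releases, which are assumed here to erase type variables the same way. In this sketch (hypothetical names ErasedBoundDemo, first, erasedReturnType), the bounded method's return type erases to Comparable and the unbounded one's to Object.

```java
import java.util.List;

class ErasedBoundDemo {
    // Bounded type parameter: A erases to its bound, Comparable.
    static <A extends Comparable<A>> A max(List<A> xs) {
        A best = xs.get(0);
        for (A x : xs) {
            if (best.compareTo(x) < 0) best = x;
        }
        return best;
    }

    // Unbounded type parameter: A erases to Object.
    static <A> A first(List<A> xs) {
        return xs.get(0);
    }

    // Looks up the erased (run-time) return type of one of the methods above.
    static Class<?> erasedReturnType(String name) {
        try {
            return ErasedBoundDemo.class
                .getDeclaredMethod(name, List.class)
                .getReturnType();
        } catch (NoSuchMethodException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(erasedReturnType("max"));
        System.out.println(erasedReturnType("first"));
    }
}
```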

As you have seen, GJ code compiles into Java that looks much 
like what you would write if generics had not been available. 
As a result, you can often take legacy code and add type infor- 
mation, even if only binary class files are available, using a spe- 
cial retrofitting mode of the GJ compiler. 

For instance, say you have a class file for the Java version of the 
LinkedList class described earlier, but you wish to use it as if it has 
GJ types. Example 5 shows the necessary retrofitting file. 

To support independent compilation, the GJ compiler stores 
extra type-signature information in JVM class files (the class file 
format is designed to permit such additions, which are ignored by 
the JVM at run time). The retrofitter takes the existing Java class 
file, checks that its type signatures are the erasures of the GJ sig- 
natures, and produces a new class file with the GJ signatures added. 
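
Later Java releases also record generic signatures in class files and expose them through reflection. Whether that format matches the one GJ uses is not assumed here; the sketch only illustrates the idea that the erased type and the extra signature information coexist. The class name SignatureDemo and its helpers are hypothetical.

```java
import java.lang.reflect.Method;
import java.lang.reflect.Type;
import java.util.List;

class SignatureDemo {
    // Finds a public no-argument method by name.
    static Method lookup(Class<?> c, String name) {
        try {
            return c.getMethod(name);
        } catch (NoSuchMethodException e) {
            throw new IllegalStateException(e);
        }
    }

    // The erased return type, which is what the JVM uses at run time.
    static Class<?> erasedReturnType(Class<?> c, String name) {
        return lookup(c, name).getReturnType();
    }

    // The generic return type, recovered from the stored signature information.
    static Type genericReturnType(Class<?> c, String name) {
        return lookup(c, name).getGenericReturnType();
    }

    public static void main(String[] args) {
        System.out.println(erasedReturnType(List.class, "iterator"));
        System.out.println(genericReturnType(List.class, "iterator"));
    }
}
```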

The entire collection class library for Java 2 has been retrofitted 
in this way. Every public interface, class, and method in the li- 
brary— without a single exception— has an appropriate corre- 
sponding GJ type. Because retrofitted class files differ only in the 
addition of GJ type signatures (which are ignored by the JVM at 
run time), you can run the resulting code in a browser that is Java 
2 compliant without reloading the collection class library. 

In most cases, you would anticipate eventually rewriting the 
source library with parameterized types. The advantage of 
retrofitting is that you may schedule this rewriting at a conve- 
nient time — it is not necessary to rewrite all legacy code before 
new code can exploit generic types. 


Example 5: The retrofitting file.


Conclusion
Java programmers often use a generic idiom where elements of a data
structure are given the type Object. This is simple, but forces you to
keep track of the actual type of the elements and add many casts. In
contrast, adding generic types to the Java language adds a little
complexity, but now it is the compiler rather than you that keeps track
of the type of the elements and adds the casts.

In particular, generic types have the advantage of turning run- 
time exceptions into compile-time errors. Here’s what Bill Joy, one 
of the authors of the Java language specification, had to say about
adding generics to Java (JavaOne Conference, June 1999):


We're continuing to work on the idea of catching more of the er- 
rors during development, putting a parameterized type system in 
the language. For me, it’s not so much to make the language more 
expressive, but to get rid of casts so there are less errors found af- 
ter you ship the code. 

After you ship, it costs you about 10,000 times as much to fix a 
software bug, and as a programmer, it’s also really annoying. If the 
bug is caught at compile time, you’re sitting near what’s going on. 
The code concepts are in your mind. If the bug report comes in 
from the field, it gets assigned to bug tracking and eventually makes 
it back to your desk or to somebody else’s desk, and the thinking 
that went into the original design isn’t there anymore. You have to 
reload your “cache” memory in your brain. So this whole idea of 
catching errors up front is a real advantage. 


If you see the class Object mentioned in a Java program, it is 
usually a sign that the program would benefit from the use of 
generic types. You might say that the class is well named — when 
you see it you should “object” and use generic types instead. 

GJ is a language design that extends Java with generics. It 
demonstrates that it is possible to add generics to Java in a sim- 
ple and usable way. Several large programs have been imple- 
mented in GJ, including the GJ compiler itself. As mentioned 
earlier, other languages support generic types in other ways. In 
particular, C++ supports generic types with templates. Although 
they have similar syntax and similar uses, C++ templates and GJ 
generics are implemented in quite different ways. 

C++ templates are implemented by expansion, making one 
copy of the code for each type where it is used (for instance, 
in our first example, there would be three copies, one for bytes, 
one for strings, and one for lists of strings)— this frequently 
leads to code bloat. Furthermore, because the template might 
be defined in one file and used in another file, errors caused 
by the expansion are often not reported until link time and can 
often be hard to track down. My colleagues Brian Kernighan 
and Rob Pike report on a small C++ program where templates 
generated a variable name 1594 characters long. (See The Prac- 
tice of Programming, Addison-Wesley, 1999.) 

In contrast, GJ generics are implemented by erasure, so there 
is no code bloat. All of the constraints that a type variable must 
satisfy are specified by the bounds on that variable, so all errors 
are reported at compile time, not link time. (This is just as well 
because Java has dynamic linking, so link time is the same as run 
time.) On the other hand, to make this work smoothly, type pa- 
rameters must always be a reference type like Byte rather than a 
primitive type like byte. So, until Java compilation technology im- 
proves, GJ generic types may be less efficient than C++ templates. 
(Perhaps this is no big deal, since until compilation technology 
improves, Java is less efficient than C++ anyway.) 
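
Both consequences of erasure can be observed directly. This is a small sketch under the same assumption as before, namely that a current Java compiler (whose generics also use erasure) stands in for the GJ compiler; the names are illustrative only.

```java
import java.util.ArrayList;
import java.util.List;

class ErasureDemo {
    // With erasure there is one compiled class, whatever the type parameter.
    static boolean sameRuntimeClass() {
        List<String> strings = new ArrayList<String>();
        List<Integer> numbers = new ArrayList<Integer>();
        return strings.getClass() == numbers.getClass();
    }

    // Type parameters must be reference types, so the elements are
    // Integer objects rather than primitive ints.
    static int sum(List<Integer> numbers) {
        int total = 0;
        for (Integer n : numbers) {
            total += n.intValue();
        }
        return total;
    }

    public static void main(String[] args) {
        System.out.println(sameRuntimeClass());
        List<Integer> numbers = new ArrayList<Integer>();
        numbers.add(Integer.valueOf(1));
        numbers.add(Integer.valueOf(2));
        System.out.println(sum(numbers));
    }
}
```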

While GJ is designed to be eminently practical, it has its roots 
in some esoteric theory. Ideas that contributed to the design of 
GJ include Church’s lambda calculus from the 1930s, Curry and 
Hindley’s type inference system from the 1950s, and Girard and 
Reynolds’s polymorphic calculus from the 1970s. GJ programmers 
don’t need to understand these concepts, but they helped the GJ 
designers to do a better job. Mathematics from the last century is 
relevant to designing languages for the next millennium. 

Collaborative Applications 

Java Shared Data Toolkit 

What the JSDT can and can’t do for you 

Joshua Fox 

Joshua develops collaborative software at VocalTec Communications Ltd. He can be 
contacted at jtfox@usa.net. 

The Java Shared Data Toolkit (JSDT) is a freely available class library from 
Sun Microsystems designed to help you write collaborative applications. 
Distributed collaborative systems, sometimes referred to as groupware or 
multiuser applications, let groups of users work simultaneously on a common task. 
Typical collaborative apps include workflow management systems, distance 
learning, video conferencing, and the like. (At VocalTec, the company I work 
for, we are developing a system that lets you surf the Web with another person 
while talking with him or her over IP telephony.) At this writing, the current 
release is JSDT 1.5, although 2.0 is in beta. The JSDT is not a standard Java 
extension (with a javax package name); instead, it is an independent toolkit from 
Sun (with a com.sun package name) and one of the Java Media APIs. The JSDT 
works with Java 1.1.x, 1.2.x, and 1.3, and is available at 
http://java.sun.com/products/java-media/jsdt/index.html. Note that, while the 
JSDT is free, source code is not currently available. 

The JSDT’s strength is in grouping participants according to dynamic criteria set 
by the participants themselves. For instance, how do you make sure that Alice 
plays chess with Bob (both are masters), while Carol plays checkers with Doug, 
without sending Bob’s chess moves into the Carol-Doug face-off? And what if 
Alice wants to switch to checkers, or play a number of simultaneous chess matches? 
This is the sort of problem that the JSDT is designed to solve. 

The JSDT helps you determine who talks with whom. Yet the very nature of 
collaboration, in which distributed systems are combined with multiple user 
interfaces, means that the JSDT cannot solve all the problems of collaboration for 
you. In this article, I'll discuss the problems that JSDT can solve for you, the 
problems it fails to address, and problems that remain as an inevitable part of 
developing collaborative systems. 


JSDT Transport Layers 

The JSDT runs over network transport layers that pass information between two 
participants, while the JSDT coordinates the participants. The JSDT comes with a 
number of these transport layers (called “implementations”), namely TCP and UDP 
sockets, HTTP, and the Lightweight Reliable Multicast Protocol (LRMP); see 
http://webcanal.inria.fr/lrmp/index.html for a description of this protocol, and to 
download the necessary binaries. Of these, the socket implementation is the most 
extensively tested and used, and thus is the only practical choice for most 
applications. In this article, assume that the socket implementation is being used 
unless otherwise specified. 

The socket implementation will not penetrate firewalls. The HTTP implementation 
has the advantage of penetrating firewalls (HTTP connections can be made to a 
servlet on port 80, which gets through most firewalls), but is inherently less 
scalable than the socket implementation. In earlier JSDT versions, JSDT clients 
repeatedly (on the order of once per second) opened a new HTTP connection to the 
server to get information with an HTTP request/response, degrading performance. 
JSDT 2.0 is slated to use persistent HTTP connections to solve this problem. 

You can replace the transport layer. The JSDT has a set of interfaces that let 
you write your own implementation if you have some exotic protocol that you 
would like to use. A simpler way of providing your own transport layer is to 
provide a socket factory to the JSDT’s socket implementation, much as RMI 
allows. You simply write a class that knows how to return a new instance of a 
subclass of java.net.Socket or java.net.ServerSocket. Because these Socket 
classes are designed for extension through the java.net.SocketImpl class, you can 
implement arbitrary transport layers with this hook. One built-in socket factory 
lets you use the Secure Socket Layer for encryption, although legal issues prevent 
a compatible library from being widely available. (There is an SSL library as part 
of the strong-encryption version of the Java Web Server, available in the United 
States and Canada only. This library needs to be extracted from the installed 
Java Web Server for that purpose, and distributed with all JSDT clients. Its size 
and legal difficulties make this impractical. Alternatively, other SSL libraries 
could be adapted to the purpose of the JSDT Socket Factory.) 


The Challenges of Collaborative Systems 

Collaborative applications over the network combine the challenges of 
distributed systems with those of user-interface systems, with multiple users 
involved. User interfaces require a fast reaction and instantaneous feedback: no 
one wants to stare at the keyboard wondering where the keystroke went. 
Distributed systems, on the other hand, depend on a network of unknown size and 
reliability to transfer data. This poses two problems: 


• Nondeterministic latency: Not knowing how long data will take to traverse the 
network. 

• Partial failure: A remote part of the system may fail or become disconnected, 
but other parts of the system should keep on working. 


The phone system is a high-quality 
worldwide system with many similarities 
to collaborative software. The rare ex- 
ceptions to the phone system’s high lev- 
el of quality provide some good exam- 
ples of problems in such systems. When 
Alice doesn’t hear Bob’s voice for a few 
seconds, she is likely to say: “Hello, Bob, 
are you there?” In fact, there is no way 
for her to distinguish between a dead 
line and a taciturn Bob, other than ask- 
ing him to speak up. Likewise, when 
Bob hears an echo of his voice, or a de- 
lay in Alice’s voice, it becomes almost 
impossible to communicate. Not only in 
telephone conversations is instant feed- 
back needed—we need it in collabora- 
tive software as well. The Internet al- 
ways poses problems of latency and 
partial failure, and the JSDT cannot elim- 
inate them for you. Your application lay- 
er will have to work around them. 

In any distributed system, a great many 
things can go wrong. JSDT provides a 
wide variety of exception types (24 at last 
count) to let you check what went wrong. 
In general, all you care about is that some- 
thing went wrong, so you can just catch 
JSDTException, the superclass of all JSDT 
exceptions. 

Any distributed system needs a way of 
checking and cleaning up a crashed con- 
nection; this should allow for trying to re- 
connect if that is warranted. Although TCP/IP 
sockets, for example, have some keep-alive 
functionality, the process of checking and 
restoring the connection is best conducted 
at the application level. (In fact, the JDK 
java.net.Socket class specifically excludes 
access to the TCP/IP keep-alive option, 
which you can set in C, for example. See 
the JDK 1.2 guide to Socket Options in Java; 
http://java.sun.com/products/jdk/1.2/docs/guide/net/socketOpt.html.) A new 
addition in JSDT 2.0 is the Connection class, which can let a ConnectionListener 
know when the connection has failed. This helps you implement your own 
application-level system for restoring connections over the JSDT; see Listing One. 


The JSDT and RMI 
There are a number of similarities between 
the JSDT and RMI. Both are Java protocols 
for exposing shared objects for use in dis- 
tributed systems. Both have a registry, 
which can be independent of any specific 
server, for looking up shared objects. Oth- 
er similarities include socket factory and 
fall-back firewall penetration using HTTP. 
The difference between the two is in 
their purpose: RMI focuses on connecting 
two participants with method calls, while 
letting the participants find each other. 
The JSDT, on the other hand, focuses on 
controlling which participants talk with 
each other. In RMI, you link precisely two 
clients with a remote object, and the re- 
mote reference is accessed or discarded 
much like local object references, while 
in the JSDT, you can precisely monitor 
and control multiple clients as they join 
and leave shared objects. In RMI, you can 
make your own classes remotely accessi- 
ble in the registry, while JSDT shared ob- 
jects fall into only a few predefined 
types— the Registry, Session, Channel, 
ByteArray, Token, and Client Listener. 


Fully Distributed or Client/Server Architecture? 

The architecture of distributed systems is 
always in tension between client/server 
and fully distributed systems. Does a cen- 
tral server route all data between clients? 
Does a server decide who talks with 
whom? As you design your JSDT-based 
system, you should realize that the JSDT 
imposes restrictions on the nature of the 
distribution in your design: In brief, there 
must be a central server, but the client ap- 
plications must be in charge of their own 
state. You can avoid missteps if you fit 
your architecture to the JSDT-imposed de- 
sign restrictions. 

Some sort of server is unavoidable with 
the JSDT. The JSDT requires a registry — 
a way for distributed participants to locate 
shared objects such as Sessions. In con- 
trast, other technologies provide ways to 
get around the need for a central registry 
in a distributed architecture. Jini does it 
by letting users multicast their search for 
a lookup service. Still, this multicast is only 
practical in a small network— you can’t 
search the entire Internet for your lookup 
service. Likewise, Internet DNS lookup is 
not completely centralized; rather, each 
local zone has a lookup service that can 
turn to wider area— and therefore more 
centralized— lookup services as neces- 
sary. The JSDT also requires that the serv- 
er start all Sessions if you want to work 
with unsigned applets, which may com- 
municate only with the server from which 
they were downloaded. In addition, you will need to start your Sessions on a 
server, because once the client who starts a shared object exits it, other clients 
are unable to keep using it. Fortunately, after you have set up the shared objects, 
the JSDT does the data-passing work transparently, freeing you from writing code 
to pass data from client to server and then to the other clients. To do it yourself 
would require some challenging multithreading work to switch messages between 
dynamically changing configurations of clients. 

Collaboration at its purest lets each participant control its own state, and the 
design of the JSDT encourages this: Each client calls the appropriate method to 
join a Session or Channel, rather than being placed there by an outside entity. A 
participant can learn another client’s behavior, send invitations to change, or even 
compel change, but the logical center of each client’s state is in that client. 


Controlling Data Sharing 

Beyond JSDT-imposed centralization re- 
quirements, your design logic might need 
a central controller to decide who talks to 
whom. You might want to require autho- 
rization for joining a Session or other 
shared object, or Clients might be re- 
quested to join Sessions based on some 
centralized logic. For example, two game 
players who connect to the system in se- 
quence might be assigned a new Session 
to play against each other, or Clients with 
names that are known from a database 
might be assigned to the same Session 
each time they connect to the system. A number of features of the JSDT allow this 
sort of central control. In the JSDT API, these features are not reserved for a 
central server. A central controller in your system is, in JSDT terms, simply 
another participant for whom you have implemented these control features. 

Monitoring and control of shared ob- 
jects is provided by Managers, Listeners, 
and Client Listeners; see Table 1. 

You can add a Manager to shared ob- 
jects such as Sessions, Channels, and the 
registry. When an attempt is made to cre- 
ate, destroy, or join a managed shared ob- 
ject, the Manager issues a challenge in the 
form of a Java Object and the Client gives 
a response in the form of another Object, 
which Managers must approve. To make 
the authentication secure, you'll have to 
add encryption. For most applications, se- 
curity will not need to be added. 

Where the Managers let you control ac- 
tions, shared-object Listeners let you mon- 
itor actions. Listeners such as RegistryListener 
and SessionListener are call-back interfaces 
that are notified when someone performs 
an action such as creating, destroying, join- 
ing, or leaving the shared object. 

Managers and Listeners assume an ac- 
tive Client that attempts to join and leave 
the shared object on its own initiative. If 
you want Clients to receive requests to join 
or leave shared objects, the Clients can en- 
ter themselves in the registry as Client Lis- 
teners, a type of shared object that makes 
it possible to look up the Clients, then get 
requests. Another participant in the col- 
laboration system can look up the Client 
Listener and request that it join or leave a 
shared object. A ClientListener object will 
receive an event through a call-back 
method and act accordingly. 

Whatever degree of central control you put into your architecture, the Clients, not 
the central controller, still contain the information on the distributed objects they 
belong to. This can pose problems if you know what a Client should be doing, but 
it is not connected. For example, if Bob asks to play a game with Alice, and Alice 
has not yet connected (but might do so in a minute), you will have to store this 
information in your own data structure. The moment that Alice connects, you must 
move her Client to the Session. At that point, you have to erase this information 
from your own data structure, since it is now duplicating information stored in the 
JSDT. Also, if Alice’s application is participating in one of the Sessions in your 
collaboration system, the fact that she is in that Session is stored in her 
client-side JSDT component. Unless you duplicate that information in your own code, 
when she closes and reopens her application, the information on what Session she 
was in will be lost. 

Table 1 (package com.sun.media.jsdt) 

Shared Objects   Managers           Factories         Listeners            Adaptors
Session          SessionManager     SessionFactory    SessionListener
Channel          ChannelManager     Session           ChannelListener
ByteArray        ByteArrayManager   Session           ByteArrayListener    ByteArrayAdaptor
Token            TokenManager       Session           TokenListener
Client           -                  ClientFactory     ClientListener       ClientAdaptor
Registry         RegistryManager    RegistryFactory   RegistryListener
Connection***    -                  -                 ConnectionListener

When it joins shared objects, a central 
controller has to have a Client, and is mere- 
ly another participant as far as the JSDT is 
concerned. To simply listen to events, it 
does not need to join. But to perform ac- 
tions on a shared object (such as destroy- 
ing it), it must join the shared object using 
a Client. You must take care in your de- 
sign to distinguish controllers from partic- 
ipants who actually share data. For exam- 
ple, you often want to destroy a Session 
when it is empty—when the last Client 
leaves it. Your controller could listen to 
events from the shared object. When clients 
leave, it would check that there are zero 
clients, join the shared object with a Client, 
then destroy it. To simplify this, a flag for 
distinguishing participants from controllers 
would be a welcome addition to the JSDT. 


Port-Binding Problems 

There is an unfortunate coupling between 
Client Listeners in the same physical ma- 
chine. A java.net.ServerSocket is bound to 
the given port for Client Listeners as well 
as for Sessions. Therefore, ports for Client 
Listeners have to be unique per machine, 
and distinct from any Session ports. If you 
know how many Client Listeners there will 
be per machine, assign them well-known 
ports. If you cannot know in advance how 
many Client Listeners there will be in a 
machine, you can dynamically assign ports 
to Client Listeners to avoid collision: Catch 
PortInUseException, increment the port 
number, and try registering the Client Lis- 
tener again (see Listing Two). This run- 
time port assignment makes it difficult to 
know how to find the Client Listener. The 
port is no longer well known. However, 
by using ClientFactory.listClients(), you 
can get a list of Client Listeners that you 
can iterate through, looking for the Client 
Listener with a given name, then ask for 
that Client Listener’s port. The complexi- 
ty of this procedure means that you will 
probably want to limit yourself to one 
Client Listener per application, but you 
will still have to account for the possibil- 
ity of multiple virtual machines (VMs) on 
a hardware machine competing to bind 
their Client Listeners to a port. 
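
The retry-on-collision idea can be tried without the JSDT at all. The sketch below is only an illustration of the technique using plain java.net.ServerSocket: PortProbe and bindFirstFree are hypothetical names, and java.net.BindException stands in here for the JSDT's PortInUseException.

```java
import java.io.IOException;
import java.net.BindException;
import java.net.ServerSocket;

class PortProbe {
    /**
     * Tries startPort, startPort + 1, and so on until a bind succeeds
     * or maxAttempts ports have been tried.
     */
    static ServerSocket bindFirstFree(int startPort, int maxAttempts) throws IOException {
        BindException last = null;
        for (int i = 0; i < maxAttempts; i++) {
            try {
                return new ServerSocket(startPort + i);
            } catch (BindException inUse) {
                last = inUse;   // port taken: move on to the next one
            }
        }
        throw last != null ? last : new BindException("no ports tried");
    }

    public static void main(String[] args) throws IOException {
        // Two listeners in one VM: the second collides with the first
        // and ends up on a higher port number.
        try (ServerSocket first = bindFirstFree(5661, 50);
             ServerSocket second = bindFirstFree(first.getLocalPort(), 50)) {
            System.out.println("first listener on port " + first.getLocalPort());
            System.out.println("second listener on port " + second.getLocalPort());
        }
    }
}
```

Because the port that finally gets bound is only known at run time, the caller has to publish it somewhere, which is exactly the lookup burden described above.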

Similar problems arise with binding mul- 
tiple Sessions to the same port. For Sessions, 
the problem is with opening two Sessions 
with different names in different VMs, us- 
ing the same port on the same physical ma- 
chine. (When you open two Sessions on 
one port in the same VM, that’s okay, since 
the two Sessions are multiplexed over the 
port.) So, when writing two distinct JSDT 
servers, be sure to use two distinct ports 
for any Sessions that they create. 

The port-binding problems will arise more 
frequently in development, when you will 
often be running everything on one ma- 
chine (localhost), than in production, where 
there is usually one application on each 
physical machine. Simply being aware of 
the port-binding issues as you design your 
system will solve many of the difficulties. 


Entering and Leaving Sessions 

The Session is the central feature of the 
JSDT. It represents a group of participants 
interested in communicating with each 
other. You get a reference to a Session object with the 
SessionFactory.createSession() method, which creates the Session if it 
does not already exist, or alternatively returns a reference to an existing Session. 
Calling this method also lets you join the 
Session. The createSession() method is 
where you are likely to get more JSDT- 
Exceptions than in any other method call, 
since this is where a connection is opened 
in the underlying implementation. 


The Channel and the ByteArray 

While the Session is a grouping of Clients, 
the Channel and ByteArray are ways for 
those Clients to share data (Table 1). If the 
group of Clients is to share more than one 
type of data, you will want to use more 
than one Channel or ByteArray, rather than 
tagging the data with its type. These two distributed objects are quite similar: They 
both pass information in the form of byte arrays, strings, or objects. The difference is 
in the way data is received. When you 
transmit data over a Channel, it arrives ac- 
tively at the other side, asynchronously 
through a call-back interface, or syn- 
chronously through a blocking method. On 
the other hand, when you place data in a 
shared ByteArray, the data just sits there 
waiting for someone to read it. However, 
because a ByteArray can produce an event 
indicating that its value has changed, the 
choice between Channel and ByteArray is 
mostly a matter of convenience. 
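
The difference between the two delivery styles can be modeled in a single process. The classes below are hypothetical stand-ins written for this illustration, not the JSDT's Channel and ByteArray: PushChannel hands data to its consumers as soon as it is sent, while SharedValue stores the latest value and merely signals that it changed.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Channel-like: data is delivered to registered consumers when it is sent.
class PushChannel {
    private final List<Consumer<byte[]>> consumers = new ArrayList<Consumer<byte[]>>();

    void addConsumer(Consumer<byte[]> consumer) {
        consumers.add(consumer);
    }

    void send(byte[] data) {
        for (Consumer<byte[]> consumer : consumers) {
            consumer.accept(data.clone());
        }
    }
}

// ByteArray-like: the value sits there until someone reads it;
// listeners are told only that it has changed.
class SharedValue {
    private byte[] value = new byte[0];
    private final List<Runnable> changeListeners = new ArrayList<Runnable>();

    void addChangeListener(Runnable listener) {
        changeListeners.add(listener);
    }

    void setValue(byte[] newValue) {
        value = newValue.clone();
        for (Runnable listener : changeListeners) {
            listener.run();
        }
    }

    byte[] getValue() {
        return value.clone();
    }
}

class DeliveryDemo {
    public static void main(String[] args) {
        PushChannel channel = new PushChannel();
        channel.addConsumer(data ->
            System.out.println("pushed: " + new String(data, StandardCharsets.UTF_8)));
        channel.send("e2-e4".getBytes(StandardCharsets.UTF_8));

        SharedValue shared = new SharedValue();
        shared.addChangeListener(() ->
            System.out.println("changed: " + new String(shared.getValue(), StandardCharsets.UTF_8)));
        shared.setValue("e7-e5".getBytes(StandardCharsets.UTF_8));
    }
}
```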


Strings, Byte Arrays, or Objects? 
Information transmitted in the JSDT is encapsulated either in shared ByteArray 
objects or in Data objects sent through a 
shared Channel. In either case, you can 
get or set the data as a byte array, string, 
or serializable Java object. Each of these 
provides a different way of encoding data 
in a protocol specific to your application: 
You can encode data in byte arrays with 
primitive types in a predetermined order, 
in strings with delimiters and keywords, 
or in objects that you define. 
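
To make the first two encodings concrete, here is a minimal sketch that uses only the standard java.io classes. The chess-move message, the MOVE keyword, and the "|" delimiter are assumptions invented for the example; the resulting byte array or string is what you would then hand to the JSDT.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;

class MoveCodec {
    // Byte-array encoding: primitive fields written in a fixed, agreed order.
    static byte[] toBytes(int fromSquare, int toSquare, String player) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        DataOutputStream out = new DataOutputStream(buffer);
        out.writeInt(fromSquare);
        out.writeInt(toSquare);
        out.writeUTF(player);
        out.flush();
        return buffer.toByteArray();
    }

    // The receiver must read the fields back in exactly the same order.
    static String describeBytes(byte[] data) throws IOException {
        DataInputStream in = new DataInputStream(new ByteArrayInputStream(data));
        int from = in.readInt();
        int to = in.readInt();
        String player = in.readUTF();
        return player + " moves " + from + " to " + to;
    }

    // String encoding: a keyword followed by delimiter-separated fields.
    static String toText(int fromSquare, int toSquare, String player) {
        return "MOVE|" + player + "|" + fromSquare + "|" + toSquare;
    }

    // The receiver has to parse its way through the string.
    static String describeText(String message) {
        String[] parts = message.split("\\|");
        if (parts.length != 4 || !parts[0].equals("MOVE")) {
            throw new IllegalArgumentException("not a MOVE message: " + message);
        }
        return parts[1] + " moves " + Integer.parseInt(parts[2])
            + " to " + Integer.parseInt(parts[3]);
    }

    public static void main(String[] args) throws IOException {
        System.out.println(describeBytes(toBytes(12, 28, "Alice")));
        System.out.println(describeText(toText(12, 28, "Alice")));
    }
}
```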

Unless your data is quite simple, in 
which case you can send it as a string or 
byte array, you will probably want to 
transmit objects. An object knows how to 
provide information about itself. Just as 
RMI, CORBA, and DCOM gain their pow- 
er from their object-based protocols, you 
can gain the same advantages in your 
JSDT application. An object does not need external parsers to act on the basis of 
information it conveys; it can carry out the action itself when it arrives at its target. 
(Compare the Command design pattern.) 
You will find it easy to change your de- 
sign by adding fields and methods to an 
object. The compiler’s type checking 
makes sure that both the sending and re- 
ceiving side recognize the same encap- 
sulation of the data. Moreover, objects al- 
low random access to information: You 
can read fields or call methods without 
parsing your way through a string. Ob- 
jects also provide the advantage of poly- 
morphism for different types of data, so 
that different subclasses can have differ- 
ent effects on the receiving side. Just don’t 
forget to make your class implement 
java.io.Serializable and to give it a pub- 
lic default constructor. 
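
A short sketch shows how such self-acting objects might look. Board, GameCommand, MoveCommand, and ResignCommand are hypothetical classes made up for this example, and the serialize-then-deserialize round trip merely stands in for sending the object through the JSDT.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.ArrayList;
import java.util.List;

// Hypothetical receiving-side state.
class Board {
    final List<String> log = new ArrayList<String>();
}

// Base type for everything sent between participants.
abstract class GameCommand implements Serializable {
    private static final long serialVersionUID = 1L;
    abstract void applyTo(Board board);
}

class MoveCommand extends GameCommand {
    private static final long serialVersionUID = 1L;
    private String player = "";
    private int from;
    private int to;

    public MoveCommand() { }   // no-argument constructor, as the text advises

    MoveCommand(String player, int from, int to) {
        this.player = player;
        this.from = from;
        this.to = to;
    }

    void applyTo(Board board) {
        board.log.add(player + " moves " + from + " to " + to);
    }
}

class ResignCommand extends GameCommand {
    private static final long serialVersionUID = 1L;
    private String player = "";

    public ResignCommand() { }

    ResignCommand(String player) {
        this.player = player;
    }

    void applyTo(Board board) {
        board.log.add(player + " resigns");
    }
}

class CommandDemo {
    static byte[] serialize(GameCommand command) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        ObjectOutputStream out = new ObjectOutputStream(buffer);
        out.writeObject(command);
        out.close();
        return buffer.toByteArray();
    }

    static GameCommand deserialize(byte[] bytes) throws IOException, ClassNotFoundException {
        ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(bytes));
        try {
            return (GameCommand) in.readObject();
        } finally {
            in.close();
        }
    }

    public static void main(String[] args) throws Exception {
        Board board = new Board();
        // The receiver never inspects the concrete type; polymorphism
        // selects the behavior.
        deserialize(serialize(new MoveCommand("Alice", 12, 28))).applyTo(board);
        deserialize(serialize(new ResignCommand("Bob"))).applyTo(board);
        System.out.println(board.log);
    }
}
```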

The only disadvantage of object serial- 
ization is that it is quite slow. In most cas- 
es, however, the rate-limiting factor is like- 
ly to be the human user or the network, 
not the serialization. If you intend to use 
the JSDT for high data-rate applications, 
such as multimedia streaming (the JSDT 
is fast enough to do this), then you will 
want to use byte arrays filled with primitive data types. A look at the “phone” and 
“sound” examples in the JSDT release makes for an instructive comparison between 
the two techniques for sending audio packets: In the phone example, audio is sent in 
SoundPacket objects, each of which encapsulates a byte array of audio data. But in 
the sound example, the AudioClick object (which encapsulates a byte array) is not 
sent over the JSDT Channel; rather, the byte array of sound is sent directly. 


The Token 

Java made single-machine multithread- 
ing much easier by including synchro- 
nization in the language. With distribut- 
ed systems, synchronization becomes 
much more difficult. The JSDT Token lets 
you synchronize client applications. A 
Token resembles a Java monitor in some 
ways: One Client grabs it, and other 
Clients cannot grab it until it is released. 
Just as local threads wait on a monitor 
for a synchronized block to exit, so JSDT 
Clients listen for the release of a Token 
with a TokenListener. 

Other comparisons between local monitors and JSDT Tokens come to mind. 
When you use synchronization in a single Java VM, you want to avoid deadlock, 
in which one Thread is holding a monitor, waiting to release it until signaled by 
another Thread, while the other Thread is stuck waiting to receive the same 
monitor. Fear of deadlock is why the suspend() and resume() methods of Thread 
were deprecated in Java 2. On the other hand, waiting for a Token to be freed does 
not have to freeze a client (and the Token-release event arrives at the 
TokenListener in a separate thread). Thus, distributed deadlock can be avoided in 
JSDT applications. Still, you might have to wait a long time for a Token, given the 
latency inherent in distributed systems. 

In single-machine applications, Threads 
should not exit without releasing moni- 
tors. This is why the destroy() method of 
Thread was never implemented. The JSDT 
avoids this problem by releasing any To- 
kens held by a Client that disconnects. 
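
The grab, release, and release-on-disconnect behavior can be modeled locally to see how the pieces fit. SimpleToken below is a hypothetical single-VM model written for this illustration; it is not the JSDT Token API, and it delivers release events on the caller's thread for simplicity.

```java
import java.util.ArrayList;
import java.util.List;

class SimpleToken {
    interface ReleaseListener {
        void tokenReleased();
    }

    private String holder;   // null while the token is free
    private final List<ReleaseListener> listeners = new ArrayList<ReleaseListener>();

    synchronized void addReleaseListener(ReleaseListener listener) {
        listeners.add(listener);
    }

    /** Non-blocking: returns true if the caller now holds the token. */
    synchronized boolean grab(String client) {
        if (holder == null) {
            holder = client;
            return true;
        }
        return holder.equals(client);
    }

    /** Frees the token if the caller holds it, then notifies listeners. */
    void release(String client) {
        List<ReleaseListener> toNotify;
        synchronized (this) {
            if (!client.equals(holder)) {
                return;   // only the holder can release
            }
            holder = null;
            toNotify = new ArrayList<ReleaseListener>(listeners);
        }
        for (ReleaseListener listener : toNotify) {
            listener.tokenReleased();   // notify outside the lock
        }
    }

    /** On disconnect, anything the client held is freed. */
    void clientDisconnected(String client) {
        release(client);
    }

    public static void main(String[] args) {
        SimpleToken token = new SimpleToken();
        token.addReleaseListener(() -> System.out.println("token is free again"));
        System.out.println("Alice grabs: " + token.grab("Alice"));
        System.out.println("Bob grabs: " + token.grab("Bob"));
        token.clientDisconnected("Alice");
        System.out.println("Bob grabs: " + token.grab("Bob"));
    }
}
```

Because grab() returns at once instead of blocking, a client that fails to get the token stays responsive and simply waits for the release event.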

This can, however, cause problems. To- 
kens, like Java monitors, might typically 
be used to lock a resource while it is be- 
ing modified. If an unexpected crash oc- 
curs, data might be left in an inconsistent 
state. The classic example is a bank ac- 
count. When an application locks access 
to the account to make a deposit, and then 
loses its connection, how do you know if 
the deposit has been made? This problem 
was addressed for single-machine Threads 
in Java 2 by deprecating the stop() method 
of Thread, and relying on you to make 
Threads exit cleanly. In distributed ap- 
plications, however, disconnection is of- 
ten unpredictable and unavoidable. There 
is no built-in transaction mechanism in 
the JSDT, so you must supply your own 
application-level transactions on top of 
the JSDT Tokens. 


JSDT and Applets 

It’s hard to use the JSDT with applets. In 
fact, it is difficult to use any nonstandard 
library, including the Java Foundation Class- 
es, in an applet, because of the large JAR 
file that your users may have to download 
every time. Even the reduced client-side 
JAR file provided with the JSDT distribu- 
tion is 168 KB, to which you must add your 
own code. One solution, applicable to the 
JSDT as to other large JAR files, is to un- 
zip the JAR, remove the ARCHIVE tag from 
your HTML APPLET tag, put your applet 
through its paces, then examine your web 
server's log files. With a script to filter list- 
ings of *.class files served, you can cre- 
ate a new JAR file from the exact subset of 
the JAR that you use. 
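
The filtering step might look like the following sketch. It assumes that the web server writes its access log in Common Log Format and that successful requests carry status 200; ClassRequestFilter and its method are hypothetical names for this example.

```java
import java.util.Arrays;
import java.util.Set;
import java.util.TreeSet;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class ClassRequestFilter {
    // Matches the request and status of a Common Log Format line,
    // for example: "GET /applet/Foo.class HTTP/1.0" 200
    private static final Pattern CLASS_REQUEST =
        Pattern.compile("\"GET\\s+(\\S+\\.class)\\s+HTTP/[0-9.]+\"\\s+200\\b");

    /** Returns the sorted, duplicate-free set of class files that were served. */
    static Set<String> servedClassFiles(Iterable<String> logLines) {
        Set<String> paths = new TreeSet<String>();
        for (String line : logLines) {
            Matcher matcher = CLASS_REQUEST.matcher(line);
            if (matcher.find()) {
                paths.add(matcher.group(1));
            }
        }
        return paths;
    }

    public static void main(String[] args) {
        // Made-up log lines for illustration.
        String[] sample = {
            "host - - [10/Jan/2000:10:00:01 +0000] \"GET /applet/MyApplet.class HTTP/1.0\" 200 2048",
            "host - - [10/Jan/2000:10:00:02 +0000] \"GET /applet/logo.gif HTTP/1.0\" 200 512",
            "host - - [10/Jan/2000:10:00:03 +0000] \"GET /applet/Missing.class HTTP/1.0\" 404 0"
        };
        for (String path : servedClassFiles(Arrays.asList(sample))) {
            System.out.println(path);
        }
    }
}
```

The resulting list of paths is what you would feed to the jar tool to build the reduced archive.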

Another limitation on the use of applets 
with the JSDT is that you can’t use Client 
Listeners in unsigned applets, since Client 
Listeners open a ServerSocket. You can get 
around this by signing your applets (separately for Netscape and Internet 
Explorer, of course; see “Creating Signed, Persistent Java Applets,” by Paul 
Brigner, DDJ, February 1999), and getting user permission to open the ServerSocket. 


Resource Management 

When you write in Java, the garbage col- 
lector can make you forget your C++ dis- 
cipline about managing memory. Even in 
Java, though, you are still responsible for 
the cleanup of nonmemory resources such 
as sockets, threads, and file handles. Likewise, JSDT client applications have to 
clean up their Session with close(). You must 
take care to do this when you no longer 
need the Session, but not to do it if you 
might still need it. 

You should also be aware of variants 
in the close(boolean closeConnection) 
method of the class Session. If you call 
close(true), then all clients in your VM will 
lose their connection to that Session, not 
just the client for which you called close(). 
This is almost always harmful, except 
when you are exiting the application. To 
allow Clients to run independently, you 
must call close(false). 


Conclusion 
Although it is a new tool, the JSDT is being rapidly improved and debugged. It is 
useful for writing collaborative applications, particularly if they involve complex 
and dynamic groupings of participants that need to be controlled by a combination 
of a central server and the participants themselves. 

Listing One 

import com.sun.media.jsdt.Connection; 
import com.sun.media.jsdt.JSDTException; 

public static void main(String[] args) { 
    // Register a Connection Listener, which will receive 
    // notification when the connection fails: 
    try { 
        Connection.addConnectionListener("www.my-jsdt-server.com", "socket", 
                                         new KeepAlive()); 
    } catch (JSDTException jsdte) { 
    } 
} 

/* Class that cleans up and tries to reconnect when the connection is lost. */ 
import com.sun.media.jsdt.Connection; 
import com.sun.media.jsdt.JSDTException; 
import com.sun.media.jsdt.Session; 
import com.sun.media.jsdt.event.ConnectionEvent; 
import com.sun.media.jsdt.event.ConnectionListener; 

public class KeepAlive implements ConnectionListener { 
    /* Call-back method from ConnectionListener interface. The connection has 
     * failed--let's hope that it is restored eventually. Try to reconnect 
     * at 20 second intervals. */ 
    public void connectionFailed(ConnectionEvent event) { 
        disconnect(); // clean up just in case 
        boolean succeeded = false; 
        while (!succeeded) { 
            try { 
                connect(); 
                succeeded = true; 
            } catch (JSDTException jsdte) { 
                succeeded = false; 
            } 
            try { 
                Thread.sleep(20 * 1000L); 
            } catch (InterruptedException ie) { 
            } 
        } 
    } 

    private void connect() { 

Listing Two 

import java.net.InetAddress; 
import com.sun.media.jsdt.AuthenticationInfo; 
import com.sun.media.jsdt.Client; 
import com.sun.media.jsdt.URLString; 
import com.sun.media.jsdt.ClientFactory; 
import com.sun.media.jsdt.PortInUseException; 
import com.sun.media.jsdt.event.ClientEvent; 
import com.sun.media.jsdt.event.ClientAdaptor; 

public class MyClient implements Client { 
    private String myName; 
    public int clientListenerPort = 5661; 
    // ... 

    private void registerClientListener(Client client) { 
        while (true) { 
            try { 
                // The last parameter for the Client Listener 
                // URLString MUST be the same as the name of the Client object. 
                URLString clientListenerUrl = URLString.createClientURL( 
                    "www.my-jsdt-server.com", clientListenerPort, 
                    "socket", this.getName()); 
                ClientFactory.createClient(this, 
                    clientListenerUrl, new MyClientAdaptor()); 
                break; 
            } catch (PortInUseException piue) { 
                clientListenerPort++; // Retry after incrementing port number 
            } catch (Exception e) { 
                break; 
            } 
        } 
    } 

    public String getName() { 
        return myName; 
    } 

    public Object authenticate(AuthenticationInfo ai) { 
        return null; 
    } 

/* Implementation of ClientListener. It listens for commands on behalf 
* of your client applications. Like the Swing Adaptor classes, the JSDT 
* Adaptors provide an empty implementation of all methods of Listener, 
* so that you can implement just those methods that interest you. 
* 
private class MyClientAdaptor extends ClientAdaptor { 
/* Examples of commands that can be sent to the Client Listener. */ 
public void sessionInvited(ClientEvent event) { 


} // Now that you've been invited, you'll 
private void disconnect() { // probably want to connect to the Session, 
het i % 
} } 
VP = 4 4 public void sessionExpelled(ClientEvent event) { 
} // You've been expelled from the Session. Unlike 


: // sessionInvited, you do not need to do anything 
° ’ . 
Listing Two // to leave the Session, since you have already been expelled. 
/** This is a JSDT Client which registers a Client Listener, incrementing } 
* its port number if the port it tries is already bound. } 


*/ DDJ 
Java References





Working with the 
garbage collection 





Jonathan Amsterdam 


Of all the features in Java 1.2, references
are the most accessible and the most
mysterious. Accessible because they are
closely tied to Java’s garbage collector,
mysterious because it is not entirely clear
what they are.

The idea behind Java references is easy 
to understand: They let a program refer 
to objects without preventing those ob- 
jects from being garbage collected. There 
is also a way to obtain control just before 
an object is collected, so that you can
perform cleanup actions. That’s the story in
a nutshell. But references are a low-level 
feature, difficult to reason about and to 
use correctly. In addition to explaining 
how references work, I'll present some 
useful abstractions that make working with 
references easier. 


Why References? 

While a program is running, the garbage 
collector occasionally seeks out all the ob- 
jects that the program can access. These 
reachable objects consist of those point- 
ed to by class variables, those pointed to 
by local variables in the currently active 
methods of all threads, and any other ob- 
jects reachable from the aforementioned 
objects by following pointers. All other
objects are unreachable; the program
will never be able to access them again.
If they can’t be accessed, they can’t possibly
affect the computation. So, these unreachable
objects are garbage and their
storage can be reclaimed.

Jonathan is an Adjunct Associate Profes-

The principle that an object is garbage 
if and only if it is unreachable is obviously 
correct. Unfortunately, it is a little too cor- 
rect sometimes. To take one example, if 


you have a way of reconstructing an ob- 
ject, either by performing some compu- 
tation or reloading it from a file, then you 
may be willing to let the garbage collec- 
tor reclaim it if memory is tight. A “soft” 
reference can handle this case. Another 
situation occurs if you’re keeping a table 
of information around, keyed by object, 
the only purpose of which is to serve oth- 
er parts of the program. When one of the 
key objects becomes garbage, then you’d 
like to remove it and its associated infor- 
mation from the table. “Weak” references 
are used in this situation. 

A third kind of reference, the “phantom”
reference, is really just another way
to be notified when an object is garbage
collected, much like the finalize method.

Each kind of reference is represented 
by a subclass of Reference in the
java.lang.ref package. By passing an object to
the constructor of one of these classes, 
you obtain a reference to the object. 

A clarification before delving into the 
details: In Java parlance, the term “refer- 
ence” is used for the normal relationship 
between a variable and an object: 


String s = "On Sense and Reference"; 


You might say that s holds a reference to 
the string. But I prefer to reserve refer- 
ences for one of the three special rela- 
tionships just mentioned. Instead, I’ll des- 
ignate s as a standard pointer to the string. 


Soft References 

Consider a special case of the first situa- 
tion I described. Say you have a large ob- 
ject that is stored in a file, perhaps in se- 
rialized form. You must load it into 
memory to work with it, and you'd like 
to keep it around, space permitting, but 
you also want to give the garbage collec- 
tor the option of freeing the object when 
necessary. 

By using a soft reference to your object
(instead of a standard pointer), you
can still access the object while allowing 
the garbage collector to reclaim it. More 
precisely, if an object can be reached only 
via soft references, then the object can be 
reclaimed. The garbage collector would 
never reclaim an object that is reachable 
through a standard pointer, no matter how 
many soft references to it existed. 

To use soft references, first get your 
object: 


Object obj = readObjectFromFile(...);


The variable obj holds the object in the
normal Java way; it’s a standard pointer.
Now pass the object to the SoftReference 
constructor: 


SoftReference ref = new SoftReference(obj); 


Then make sure that there are no stan- 
dard pointers to your object: 


obj = null; 


Now, you may be able to retrieve your 
object with the get method: 


obj = ref.get();


On the other hand, get may return null, 
indicating that the garbage collector has 
reclaimed your object and cleared the ref- 
erence. In this case, if you really want the 
object back, you will have to recreate it. 
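
The retrieve-or-recreate steps above can be sketched in a few lines. This is only an illustration under stated assumptions: the class name Reloadable, the loads counter, and the Supplier-based loader are hypothetical names introduced here, and the code uses generics and lambdas, which need a much later JDK than Java 1.2. The simulateClear method merely stands in for the garbage collector clearing the reference.

```java
import java.lang.ref.SoftReference;
import java.util.function.Supplier;

// Hypothetical helper: holds a value softly and rebuilds it on demand.
class Reloadable<T> {
    private final Supplier<T> loader;       // knows how to recreate the object
    private SoftReference<T> ref = new SoftReference<>(null);
    int loads = 0;                          // counts recreations, for illustration

    Reloadable(Supplier<T> loader) {
        this.loader = loader;
    }

    T get() {
        T value = ref.get();                // null if the reference was cleared
        if (value == null) {
            value = loader.get();           // take a standard pointer first
            loads++;
            ref = new SoftReference<>(value);
        }
        return value;
    }

    // Stand-in for the collector clearing the soft reference.
    void simulateClear() {
        ref.clear();
    }
}
```

A caller never sees null: after simulateClear, the next call to get simply runs the loader again and wraps the new object in a fresh SoftReference.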

Listing One is a class called SoftObject 
that embodies this pattern. It maintains a 
soft reference to an object. Its get method 
is guaranteed to return the object (if it
doesn’t throw an exception). If the refer- 
ence has been cleared, the retrieve method 
is called and a new soft reference is cre- 
ated. (You can’t reuse the old one— ex- 
cept for being cleared, references are im- 
mutable.) 

Subclasses of SoftObject may implement 
the retrieve method as desired. You might 
perform a computation or download the 
object over the network. (SoftObjects 
would be particularly useful for image and 
sound file downloading.) A third possi- 
bility, reading the object from a serialized 
file, is shown in the FileObject class in List- 
ing Two. If the file sense.ser contained a 
serialized representation of a String, you 
could use FileObject to access the String 
like so: 


SoftObject fo = new FileObject("sense.ser");
String s = (String) fo.get();
display(s);


The call fo.get() will return the String im- 
mediately if it’s available, or read it from 
the file. As long as there is a standard 
pointer to the object, such as s or the pa- 
rameter of the display method, the object 
will not be garbage collected. 

There are dangers in working with ref- 
erences akin to those involving multiple 
threads, because the garbage collector be- 
haves much like a separate thread. Con- 
sider these two lines from the get method 
of SoftObject: 


result = retrieve(); 
ref = new SoftReference(result); 


A seemingly equivalent formulation is: 


ref = new SoftReference(retrieve()); 
result = ref.get(); 


But there is a problem with this code. If 
the garbage collector runs after the first 
line but before the second, it may reclaim
the object and ref.get() will return null.
The first version doesn’t have this prob- 
lem because it puts the newly retrieved 
object into a standard pointer before cre- 
ating the reference. 

Java’s only guarantee about a soft ref- 
erence is that it will be cleared before the 
system runs out of memory. But the hope 
and the intent is that implementations will 
choose carefully which soft references to
clear when memory is low, to provide the
best possible performance. For instance, 
an implementation might prefer to clear 
soft references that haven’t been accessed 
in a while. 


Weak References 

Weak references share with soft references 
the property that the garbage collector is 
welcome to release the contained object 
if no standard pointers to it exist. The most 
important difference between them is that 
no clever algorithms will be applied to 
clearing weak references. A weak refer- 
ence is used simply to allow an otherwise 
unreachable object to be reclaimed. The 
difference is subtle and is best illustrated 
with an example. 

As you may know, any Java string lit- 
erals in the same program that are spelled 
the same are represented by the same 
String object in memory. For instance, 
"Frege" == "Frege" is true, even though (in 
general) s.equals(t) should be used to 
compare two strings s and t. You can get
the same effect by calling the intern
method of the String class: if s.equals(t),
then s.intern() == t.intern(). In other 
words, intern returns the same object for 
all strings that are equal to one another. 
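
A small demonstration may make the distinction between identity and equality concrete. The class and method names here (InternDemo, sameObject) are made up for the illustration; only String.intern and the == operator come from the text.

```java
// Demonstrates identity versus equality for interned strings.
class InternDemo {
    static boolean sameObject(String s, String t) {
        return s == t;                      // identity comparison, not equals
    }

    public static void main(String[] args) {
        String literal = "Frege";
        String copy = new String("Frege");  // a distinct object with equal contents
        System.out.println(sameObject(literal, copy));           // false
        System.out.println(literal.equals(copy));                // true
        System.out.println(sameObject(literal, copy.intern()));  // true
    }
}
```

The copy is equal to the literal but is a different object until it is interned, at which point intern hands back the canonical String that the literal already denotes.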

When a single object is used to repre- 
sent a potentially large group of equal ob- 
jects, that object is called the “canonical 
object” for the group. (A related idea is 
the Flyweight pattern described in Design 
Patterns, by Erich Gamma et al.) Using 
canonical objects saves space, because
fewer objects need actually be in memory.
And it saves time, because the object
identity (the == operator, which can be 
done in one machine instruction) can be 
used in place of object equality (the equals 
method). 

Lisp symbols are another example of 
canonical objects. Symbols represent vari- 
ables in Lisp and consist of a name and a 
value; the Java version would be: 


class Symbol {
    String name;
    Object value;
}


In a running Lisp program, there is only 
one symbol with a given name. Symbols 
are interned, just like Java literal strings. 

It is easy to implement canonical ob- 
jects using a hashtable or other mapping 
data structure, such as Java 1.2’s HashMap. 
You can keep the canonical objects in the 
table, and canonicalize (intern) new ob- 
jects by looking them up in the table and 
adding them if not present. 

A first version of a Lisp symbol class in 
Java appears in Listing Three. If the Sym- 
bol.intern method finds a symbol in the 
table corresponding to the string, it is the 
canonical symbol, and is returned.
If it doesn’t find a symbol, it creates a new 
one, which becomes the canonical sym- 
bol for that name. Since the constructor is 
private, the only way to create a symbol 
is via the intern method. Thus you can 
guarantee that there is only one Symbol 
with a given name: if s.name.equals(t.name),
then s == t.

There is just one problem: As more and 
more names are interned, the size of the 
table grows without limit. Even if a sym- 
bol is no longer reachable by the program, 
and should be garbage, the pointer to it 
in the symbol table will prevent it from 
being garbage collected. (This is actually 
the correct behavior for an interactive Lisp 
interpreter, where users can type in a sym- 
bol name at any time, but it is not right 
for a standalone program.) 

The solution, of course, is to use Java 
references. If the symbol table holds weak 
references to Symbols instead of the Sym- 
bols themselves, then the table will not 
prevent the garbage collector from re- 
claiming an unreachable Symbol. To adapt 
the Symbol class to use weak references, 
change the intern method to that of List- 
ing Four. 

Weak references are more appropriate 
in this case than soft references, for two 
reasons. First, there is no complicated 
memory juggling going on here as there 
was in our first example. Of course, if 
space were infinite, we wouldn’t need to 
bother with references. So, in that sense, 
memory is still our concern. But we don’t 
want the system to waste its time applying
clever algorithms to determine which 
weak references should be cleared first. 
Once a Symbol is garbage, it’s garbage, 
and any weak reference to it should be 
cleared pronto. 

There is a second reason for preferring 
weak references to soft references where 
canonical objects are involved. A soft ref- 
erence may be cleared even if weak ref- 
erences still exist, but the opposite will 
not happen. In other words, consider a 
situation in which an object is reachable 
by one soft reference and one weak ref- 
erence, and nothing else. Then the soft 
reference will be cleared before the weak 
one. This can violate the correctness of 
canonical objects if the table is imple- 
mented with soft references. 

Here is a scenario that demonstrates 
the problem. Assume your symbol table 
were to use soft references. First, you 
intern a new symbol and keep a weak 
reference to it: 


WeakReference r =
    new WeakReference(Symbol.intern("Gottlob"));


There are now only two references to the 
symbol: the weak reference r and the soft
reference inside the table. The soft refer- 
ence will be cleared first. After it is cleared, 
you intern the same string, causing a new 
canonical symbol to be created: 


Symbol s1 = Symbol.intern("Gottlob");


Now you retrieve the original canonical 
symbol from the weak reference: 


Symbol s2 = (Symbol) r.get();


If the weak reference has not yet been 
cleared, then s1 and s2 are two different
Symbols with the same name, violating the 
rule that governs canonical objects. 

This problem occurs because the clear- 
ing of a soft reference does not imply that 
the object is completely unreachable. When 
a weak reference is cleared, there is truly 
no way to reach the object (not even 
through another weak reference— the 
specification requires that all weak refer- 
ences to an object are cleared atomically). 

To summarize, you use weak references 
instead of soft references to implement 
canonical objects when you don’t need 
sophisticated memory management, but 
you do require that once removed from 
the table, the canonical object can never 
be retrieved. 


Reference Queues 

There is still a problem with the table of 
symbols. Although weak references allow 
the symbols to be garbage collected, the 
weak reference objects themselves— and 
the space they use in the table— are 
reachable through standard pointers and
will not be reclaimed. The table entry for
a symbol should be removed when the 
symbol is collected. 

You will quickly appreciate that the so- 
lution is not to have another level of weak 
references to the existing weak references. 
Where would you store these new refer- 
ence objects? Taking another track, you 
could set up a thread that periodically 
scans the table and removes cleared ref- 
erences and their keys, but much of that 
thread’s effort would be wasted examin- 
ing uncleared references. Ideally, you 
would like to be notified whenever a ref- 
erence is cleared. 

Reference queues do just that. If a ref- 
erence is created with a reference queue, 
then it will be placed on that queue after 
it is cleared. The program can periodical- 
ly check the queue and perform any 
cleanup operations associated with the 
queued references. It can do that in a sep- 
arate thread, or as part of another activi- 
ty. A simple and natural choice for the 
symbol table is to clean up each time in- 
tern is called. 

Rather than modify our symbol table, 
let me present a generalization of it that 
incorporates reference queues. Called 
CanonicalTable, it resides in Listing Five. 

You typically create a CanonicalTable 
with a factory object, which is used to cre- 
ate new canonical objects when one is 
not found in the table. The factory for the 
Symbol table would call the Symbol con- 
structor. Besides a factory, an instance of 
CanonicalTable contains a HashMap and 
ReferenceQueue. (The ReferenceQueue 
class is also in java.lang.ref.) 

Calling the canonicalize method with 
a key has the same effect as calling Sym- 
bol’s intern method: The canonical object 
is returned if present, otherwise a new 
one is created (using the factory) and re- 
turned. A second version of canonicalize 
takes an object as well as a key, with the 
understanding that this object is to be- 
come the canonical object if none is found 
in the table. 

In both cases, canonicalize begins by 
doing a cleanup, the details of which I'll 
examine shortly. It then proceeds much 
like Symbol.intern, looking up the key in 
the map and creating a new canonical ob- 
ject (or using the supplied one) if neces- 
sary. The only difference is that when the 
WeakReference is created, its constructor
is given the reference queue. 

The cleanup method, called at each in- 
vocation of canonicalize, dequeues ref- 
erences from the reference queue by call- 
ing the queue’s poll method, which returns 
null when the queue is empty. If a refer- 
ence is dequeued, that means the canon- 
ical object to which it refers is about to 
become garbage, so the key-value pair for 
that object should be removed from the
map. Because references are cleared be- 
fore being queued, there is no way to re- 
trieve the canonical object. So, if WeakRef- 
erences were used directly, cleanup 
wouldn't be able to determine which key 
to remove. The solution is to write a sub- 
class of WeakReference with an instance 
variable to hold the key. This subclass, 
WeakValue, is a private inner class of 
CanonicalTable. When a WeakValue is 
dequeued, its key can be extracted and 
used to remove the key-value pair from 
the map: 


map.remove(((WeakValue) r).key);


The end result is a CanonicalTable that
cleans up after itself, removing canonical
objects that are eligible for reclamation.
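
The key-carrying reference and the queue-driven cleanup can be sketched compactly with generics. This is a hedged illustration rather than a replacement for Listing Five: WeakValueTable, Entry, purge, and simulateCollection are hypothetical names, the code needs Java 8 or later, and simulateCollection enqueues a reference by hand so that the behavior can be observed without waiting for the collector. One assumption goes beyond the article: purge removes an entry only if the map slot still holds the very reference that was queued.

```java
import java.lang.ref.Reference;
import java.lang.ref.ReferenceQueue;
import java.lang.ref.WeakReference;
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of a self-cleaning table with weakly held values.
class WeakValueTable<K, V> {
    // A weak reference that remembers the key it was stored under.
    private static final class Entry<K, V> extends WeakReference<V> {
        final K key;
        Entry(K key, V value, ReferenceQueue<V> q) {
            super(value, q);
            this.key = key;
        }
    }

    private final Map<K, Entry<K, V>> map = new HashMap<>();
    private final ReferenceQueue<V> queue = new ReferenceQueue<>();

    synchronized void put(K key, V value) {
        purge();
        map.put(key, new Entry<>(key, value, queue));
    }

    synchronized V get(K key) {
        purge();
        Entry<K, V> e = map.get(key);
        return e == null ? null : e.get();
    }

    synchronized int size() {
        purge();
        return map.size();
    }

    // Remove the entries whose references have been queued.
    private void purge() {
        Reference<? extends V> r;
        while ((r = queue.poll()) != null) {
            Entry<?, ?> dead = (Entry<?, ?>) r;
            // Remove only if the slot still holds this same reference.
            map.remove(dead.key, dead);
        }
    }

    // Test hook: clear and enqueue by hand, as the collector eventually would.
    synchronized void simulateCollection(K key) {
        Entry<K, V> e = map.get(key);
        if (e != null) {
            e.clear();
            e.enqueue();
        }
    }
}
```

Because the queued reference has already been cleared, the key stored in the Entry is the only way purge can find the slot to remove.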


It's worth mentioning another applica- 
tion of weak references. Say you wished 
to associate additional data with some ob- 
jects. One approach would be to write a 
subclass with additional instance variables, 
but that wouldn’t be viable if you didn’t 
have control over the creation of the ob- 
jects. For example, you might want to as- 
sociate additional information with each 
thread of your program, even the threads 
created and used internally by the Java 
Virtual Machine. Java supports these 
thread-local variables with two java.lang 
classes, ThreadLocal and Inheritable- 
ThreadLocal. 

These classes could work by using a 
HashMap from threads to variables, ex- 
cept for the problem that a thread and its 
associated variables will never be garbage
collected as long as the thread is present
in the table. As you know by now, the fix
is to use weak references to hold the
threads. Unlike CanonicalTable, weak references
here must hold the keys of the map
instead of the values. The JDK supplies
such a data structure as java.util.WeakHashMap.
Its source code is required reading
for students of references.
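
As a minimal sketch of attaching extra data to objects you do not create, a WeakHashMap can serve as a side table. The SideTable class and its method names are assumptions made for this example; only java.util.WeakHashMap itself comes from the JDK.

```java
import java.util.Map;
import java.util.WeakHashMap;

// Hypothetical side table that attaches a note to objects it does not own.
class SideTable {
    private final Map<Object, String> notes = new WeakHashMap<>();

    synchronized void annotate(Object target, String note) {
        notes.put(target, note);            // the key is held only weakly
    }

    synchronized String noteFor(Object target) {
        return notes.get(target);           // null if nothing was recorded
    }
}
```

Two cautions apply. WeakHashMap compares keys with equals and hashCode rather than identity, and it holds its values through standard pointers, so a value that points back to its own key will keep that key alive.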


Phantom References 

Like soft and weak references, phantom ref- 
erences have a get method, but it always 
returns null; you can never retrieve the
contained object. (That explains the ghoul- 
ish name.) So phantom references are use- 
ful only in conjunction with reference 
queues. When you dequeue a phantom ref- 
erence, you know that an object is effec- 
tively garbage, so you can clean up after it. 
Specifically, no soft or weak references to 
the object exist— a phantom reference is 
enqueued only after all other references 
have been cleared— and the object’s 
finalize method, if any, has been called. 
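
The two properties just described, a get method that always returns null and usefulness only through a queue, can be seen in a short sketch. TokenReference and PhantomDemo are hypothetical names, and the integer token is a stand-in for whatever cleanup data a real program would carry. In the usage shown afterward, enqueue is called by hand purely to imitate the collector.

```java
import java.lang.ref.PhantomReference;
import java.lang.ref.Reference;
import java.lang.ref.ReferenceQueue;

// Hypothetical phantom reference that carries the data needed for cleanup.
class TokenReference extends PhantomReference<Object> {
    final int token;                        // cleanup information, not the object
    TokenReference(Object referent, int token, ReferenceQueue<Object> q) {
        super(referent, q);
        this.token = token;
    }
}

class PhantomDemo {
    // Drains the queue and returns the sum of the tokens seen,
    // standing in for "release each resource".
    static int drain(ReferenceQueue<Object> q) {
        int released = 0;
        Reference<?> r;
        while ((r = q.poll()) != null) {
            released += ((TokenReference) r).token;
            r.clear();
        }
        return released;
    }
}
```

For example, after creating new TokenReference(owner, 42, q), calling get on it yields null even while owner is still reachable; once the reference is enqueued, drain(q) sees the token 42 exactly once.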

Java’s finalization mechanism might 
seem to render phantom references use- 
less. An object’s finalize method is called 
just before the object is garbage collect- 
ed, to provide a chance for cleaning up. 
Moreover, the finalize method has access 
to the entire object, while a phantom ref- 
erence does not. 

Phantom references solve two problems 
with finalization. The first is that the finalize
method is called by a thread you
know nothing about at a time you cannot 
predict. finalize methods have to be writ- 
ten very carefully to avoid unwanted in- 
teractions with your program. And ex- 
ceptions thrown by the finalize method 
are simply ignored, which, as you can 
imagine, makes debugging finalizers a de- 
light. The safest thing to do in a finalize 
method is to place the object on a queue 
for later processing at the program’s con- 
venience. This is just the functionality 
phantom references provide. 

The second problem with the finalize 
method is that there might not be one. If 
a class’s objects need to be finalized but 
the class writer has neglected to write a 
finalize method, phantom references can 
help. For example, say a class acquires an 
external resource (something outside the
program, like a file descriptor or network
connection) but neglects to provide a
finalize method to release it:


class Leaker {
    int erToken = ExternalResource.acquire();
    // no finalize method to release it
}


Here, I’m imagining that the class for 
the resource returns an integer token 
representing the resource. If your code
creates Leakers, then you can subclass 
Leaker and write a finalize method. But 
if you don’t have control over object cre- 
ation, this solution isn’t available. How- 
ever, if you can access the external re- 
source token inside a Leaker, you can 
create a phantom reference to each Leak- 
er object and do the release yourself. 

A PhantomReference itself can tell you
nothing about the moribund object or 
what to do about it, so you must always 
create a subclass of PhantomReference 
that contains cleanup information. Here 
we hold the resource to be released: 


class Releaser extends PhantomReference {
    int token;
    Releaser(Leaker lkr, ReferenceQueue q) {
        super(lkr, q);
        this.token = lkr.erToken;
    }
}


A crucial subtlety lurks in this code: 
The object of the phantom reference — 
the first argument to the superclass con- 
structor— must not be stored in an in- 
stance variable of the PhantomReference 
class. If it were, then there would be a 
standard pointer to the object— the one 
in the instance variable — and the ob- 
ject would never be eligible for garbage 
collection. 

Now each time you are given a Leaker, 
you create a phantom reference to it, as- 
sociated with a particular reference queue. 
The referencing object must itself be ac- 
cessible by a standard pointer— you don’t 
want it to get garbage collected before it 
can do its job— so you'll add it to a list. 
(I’m using Java 1.2 collections, but you 
can just as well use a Vector.) 


ReferenceQueue leakerQueue = new ReferenceQueue();
List releasers = new ArrayList();
Leaker lkr = ...;
releasers.add(new Releaser(lkr, leakerQueue));


Now, whenever you want, you can do 
some cleaning up: 


Releaser r = (Releaser) leakerQueue.poll();
if (r != null) {
    ExternalResource.release(r.token);
    r.clear();
    releasers.remove(r);
}


Here, the queue is polled to obtain the 
next reference whose object is ready to 
be reclaimed. Then the data in that ref- 
erence is used to clean up. The refer- 
ence is removed from the list so it, too, 
can be garbage collected. 

The call to clear is the final nail in the 
coffin of the Leaker object contained in 
the reference; after that call, it will be 
reclaimed. Calling clear is not strictly 
necessary in this case, because remov- 
ing the Releaser object from the list ren- 
ders it unreachable, and when the 
garbage collector runs again, it will re- 
claim both the Releaser and the Leaker 
that it contains. But calling clear ex- 
plicitly can’t hurt, and it may hasten the 
demise of the Leaker. 

If this seems like a lot to go through 
for one call of a cleanup method, then 
you might be interested in my Cleanup 
class; see Listing Six. All you have to do 
is register a Cleanup.Handler with an 
object, and it takes care of the rest. Reg- 
istration involves creating an instance of 
Cleanup.Handler and calling the register
method:

Leaker lkr = ...;
final int token = lkr.erToken;
Cleanup.register(lkr, new Cleanup.Handler() {
    public void cleanup() {
        ExternalResource.release(token);
    }
});


It’s important that you don’t refer to lkr
inside the cleanup method, for the same
reason I discussed previously. If you do,
lkr will never be garbage collected.

The actual cleaning up can be done di- 
rectly, whenever you wish, by calling the 
doPending method of Cleanup: 

try { 
Cleanup.doPending(); 
} catch (Exception e) {...} 


The doPending method propagates any 
exceptions thrown by Cleanup.Handlers. 

Or you can start a thread to clean up 
continuously in the background, using the 
startBackground method. This thread uses 
the remove method of ReferenceQueue, 
which makes its calling thread wait until 
a reference is enqueued. 

What happens to exceptions thrown by 
Cleanup.Handlers called from the back- 
ground thread? In “Multithreaded Excep- 
tion Handling in Java” (Java Report, Au- 
gust 1998), Joe De Russo III and Peter 
Haggar suggest using an event listener- 
like mechanism for communicating ex- 
ceptions between threads. Here, I adopt 
a simpler, if less flexible, solution. Ex- 
ceptions are accumulated into a list, which 
may be obtained at any time by calling 
Cleanup.getExceptions.


Conclusion 

References are obviously not for the ca- 
sual programmer. Leave consideration of 
references for late in the implementation 
phase of your project, and give prece- 
dence to abstractions like SoftObject, 
CanonicalTable, WeakHashMap, and 
Cleanup over naked references. When 
used correctly, references are a powerful 
tool for communicating with the garbage 
collector.

Listing One
import java.lang.ref.*;

public abstract class SoftObject {
    private SoftReference ref = new SoftReference(null);

    public Object get() throws Exception {
        Object result = ref.get();
        if (result == null) {
            // Hold the object in a standard pointer before wrapping it.
            result = retrieve();
            ref = new SoftReference(result);
        }
        return result;
    }

    protected abstract Object retrieve() throws Exception;
}


Listing Two 


import java.io.*;

public class FileObject extends SoftObject {
    private String filename;

    FileObject(String fn) {
        filename = fn;
    }

    protected Object retrieve()
            throws IOException, ClassNotFoundException {
        ObjectInputStream in =
            new ObjectInputStream(
                new FileInputStream(filename));
        try {
            return in.readObject();
        } finally {
            in.close();
        }
    }
}


Listing Three 
import java.util.*;
import java.lang.ref.*;

class Symbol {
    private String name;
    Object value;

    private static Map table = new HashMap();

    private Symbol(String nm) { name = nm; }

    String getName() { return name; }

    static Symbol intern(String name) {
        Symbol s = (Symbol) table.get(name);
        if (s == null) {
            s = new Symbol(name);
            table.put(name, s);
        }
        return s;
    }
}


Listing Four 


static Symbol intern(String name) {
    Reference r = (Reference) table.get(name);
    Symbol s = null;
    if (r != null)
        s = (Symbol) r.get();
    if (r == null || s == null) {
        s = new Symbol(name);
        table.put(name, new WeakReference(s));
    }
    return s;
}


Listing Five 


import java.util.*;
import java.lang.ref.*;

/** This class is for maintaining canonical objects. */
public class CanonicalTable {
    private Map map = new HashMap();
    private ReferenceQueue queue = new ReferenceQueue();
    private Factory factory;

    public interface Factory {
        public Object create(Object key);
    }

    public CanonicalTable() {}

    public CanonicalTable(Factory f) {
        factory = f;
    }

    public synchronized Object canonicalize(Object key) {
        return canonicalize(key, null);
    }

    public synchronized Object canonicalize(Object key, Object o) {
        cleanup();
        Object value = map.get(key);
        if (value != null)
            value = ((WeakReference) value).get();
        if (value != null)
            return value;
        else {
            if (o == null)
                o = factory.create(key);
            map.put(key, new WeakValue(key, o, queue));
            return o;
        }
    }

    public synchronized Object get(Object key) {
        cleanup();
        Object value = map.get(key);
        if (value != null)
            return ((WeakReference) value).get();
        else
            return null;
    }

    private void cleanup() {
        Reference r;
        while ((r = queue.poll()) != null)
            map.remove(((WeakValue) r).key);
    }

    //////////
    private static class WeakValue extends WeakReference {
        Object key;
        WeakValue(Object k, Object o, ReferenceQueue q) {
            super(o, q);
            key = k;
        }
    }
}

Listing Six 

import java.util.*; 
import java.lang.ref.*; 


/** A class for simplifying the use of phantom references. */ 
public class Cleanup { 


// Doubly linked list of CleanupReferences, with an empty header. 


private static CleanupReference list = new CleanupReference(); 
private static ReferenceQueue queue = new ReferenceQueue(); 
private static Thread backgroundThread = null; 

private static ArrayList exceptions = null; 


public interface Handler { 
public void cleanup() throws Exception; 
} 
/** Register a cleanup handler with an object. */ 
public static void register(Object 0, Handler h) { 
synchronized (list) { 
CleanupReference r = new CleanupReference(o, h); 
r.linkAfter (list) ; 
} 
} 
/** Perform all pending cleanup operations. */ 
public static void doPending() throws Exception {( 
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Reference r; 
while ((r = queue.poll()) != null) 
((CleanupReference) r).cleanup() ; 
} 
/** Start a thread to do cleanup in the background. */ 
public static synchronized void startBackground() { 
if (backgroundThread != null) 
return; // already running 
backgroundThread = new Thread(new Runnable() { 
public void run() ( 
while (!Thread.interrupted()) { 
try { 
CleanupReference r = (CleanupReference) queue. remove(); 
r.cleanup(); 
} catch (InterruptedException e) { 
// The exception clears the thread's interrupt status, so
// Thread.interrupted() would return false; exit explicitly.
return;
} catch (Exception e) { 
addException(e) ; 
}
}
}
});
backgroundThread.setPriority(Thread.MIN_PRIORITY);
backgroundThread.start();
} 
/** Stop the background cleanup thread. */ 
public static synchronized void stopBackground() { 
if (backgroundThread != null) { 
backgroundThread.interrupt () ; 
backgroundThread = null; 
} 
} 
/** Get a list of all exceptions generated by cleanup 
calls in the background thread. */ 
public static synchronized List getExceptions() { 
ArrayList result = exceptions; 
exceptions = null; 
return result; 
} 
private static synchronized 
void addException(Exception e) { 
if (exceptions == null) 
exceptions = new ArrayList(); 
exceptions.add(e);
}
////////////////////////////////////////
private static class CleanupReference
extends PhantomReference {
private Handler handler; 
private CleanupReference next, prev; 


CleanupReference() { // Used only for head of linked list. 
// Queue is never garbage; ensures
// no enqueuing. 
super(queue, queue) ; 
next = prev = this; 

} 

CleanupReference(Object o, Handler h) { 
super(o, queue); 
handler = h; 

} 

void linkAfter(CleanupReference c) { 
this.prev = c; 
this.next = c.next; 
c.next.prev = this; 
c.next = this; 


} 
void cleanup() throws Exception { 
try { 
handler.cleanup() ; 
} finally { 
this.clear(); 
synchronized (list) { // unlink 
this.prev.next = this.next; 
this.next.prev = this.prev; 
} 
} 
}
}
}
Python Server Pages: Part II

A portable ASP-like server-side scripting engine

Kirby W. Angell

In the first installment of this two-part article (see “Python Server
Pages: Part I,” DDJ, January 2000), I introduced Python Server Pages
(PSP), a JPython and Java Servlet-based server-side scripting engine.
To recap, I created PSP to allow developers familiar with Microsoft’s
Active Server Pages (ASP) development to write HTML pages with a
script embedded in them. The page containing the script is executed on
the server and the results are sent to the user’s browser. You could,
of course, use something like Java Server Pages (JSP) to do this, but
when I created PSP, JSP was not available. Besides, Java strikes me as
more of a system programming tool, not a scripting language.

PSP uses JPython, the Java-based version of the Python programming
language, as its scripting language (see “Examining JPython,” DDJ,
April 1999). In Part I, I looked at how HTML pages with embedded
scripts are translated into compilable JPython code by a Python-based
code generator. That’s only a small part of the work involved,
however. This month, I'll examine the Java Servlet side of PSP, which
contains all of the code to compile and execute the JPython code in
response to a request from a user.

Kirby is a Microsoft Certified Software Developer and a contributing
author to the Quick Python Book (Manning, 1999). He can be contacted
at kwangell@hotmail.com.

PSP source code and related files are
available electronically at http://www.ciobriefings.com/psp.htm
or from DDJ (see “Resource Center,” page 7).

Figure 1 shows what happens when users request web pages processed by
a Java Servlet engine. The servlet engine installs a filter in the web
server so it has a chance to process any user requests. When the
servlet engine sees a request it is interested in, the engine
processes the request and sends the output back to the web server,
where it is sent back to users. The servlet is registered with the
engine so that it is invoked whenever the user requests a file with
the .psp extension (such as helloworld.psp).


PSPServlet

The PSPServlet class (Listing One) is the
entry point for the servlet. The servlet 
engine loads it when the first request is 
received that is to be handled by the 
servlet. (Some servlet engines let you 
configure the server so your servlet is 
loaded when the engine starts. This saves 
users from suffering through the load 
time.) In this case, PSPServlet
contains a static code block that is ex- 
ecuted when the class is loaded. The 
static block sets up the services used by 
PSP for the rest of its execution. Because 
Java servlets need to be multithreaded, 
it was important to make sure this ini- 
tialization code was executed before 
anything else. 

The purpose of the initialization code 
is to load and configure the JPython com- 
piler and execution environment. More 
specifically, this section of code: 


1. Loads the Python sys module, which 
provides access to the Python module 
search path used by JPython. 


2. Updates the module search path to in- 
clude the directory where PSP is in- 
stalled. 

3. Loads the Python code-generation module
(discussed in Part I of this article).


Listing One also shows the PSPServlet
class’s interaction with the PSP class (see
Listing Two). The PSP class is a static class that
provides several utility functions to the rest
of the application. In this case, the PSP.pspRoot
property holds the location of the PSP
application. This is a nifty trick that I learned 
from looking at the source code for JPython 
(like PSP, JPython’s source code is freely 
available, so free in fact that it is installed 
when you install JPython). The PSP method 
findRoot searches through the Java class- 
path looking for psp.jar, which is the file 
that contains the PSP class files. If some- 
one else packages his classes into psp.jar, 
then I’m toast, but this is an easy way to 
find out information that Java does not nor- 
mally provide. 
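
The classpath search that findRoot performs can be tried in isolation. The following is a minimal sketch, not PSP's own code: the class name JarRootFinder and the three parameters (classpath, jarName, pathSeparator) are assumptions introduced here so the logic can be exercised without a real psp.jar on the classpath.

```java
import java.io.File;
import java.util.Locale;

/** Hypothetical helper illustrating the psp.jar lookup described above. */
final class JarRootFinder {
    private JarRootFinder() {}

    /**
     * Returns the directory part (with trailing separator) of the first
     * classpath entry that contains jarName, or null if there is none.
     */
    static String findRoot(String classpath, String jarName, String pathSeparator) {
        if (classpath == null) {
            return null;
        }
        // Case-insensitive match, so PSP.JAR on Windows is found as well.
        int jar = classpath.toLowerCase(Locale.ROOT)
                           .indexOf(jarName.toLowerCase(Locale.ROOT));
        if (jar == -1) {
            return null;
        }
        // The entry begins right after the preceding separator, or at 0.
        int start = classpath.lastIndexOf(pathSeparator, jar) + 1;
        return classpath.substring(start, jar);
    }

    public static void main(String[] args) {
        String root = findRoot(System.getProperty("java.class.path"),
                               "psp.jar", File.pathSeparator);
        System.out.println(root == null ? "psp.jar is not on the classpath" : root);
    }
}
```

As the text notes, the approach depends on the jar name being unique: any other classpath entry whose path contains psp.jar would be picked up instead.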

You need to know where PSP is in- 
stalled, because this is where the Python 
module containing the code generator 
is. I could have put the code generator 
in the normal JPython library directory 
(an early version of PSP did that), but 
after installing PSP on several servers I 
found this to be a hassle to maintain. Ev- 
ery server had a different location for 
JPython, which made it hard to remem- 
ber where the file was when I wanted 
to update it. Now the code generator 
(cg.py) is placed in the directory where 
the “psp.jar” file is, making it a trivial 
matter for PSPServlet to find and load. 
This also makes it easy to update when 
new versions of PSP are released. 

The major work performed by PSP- 
Servlet is concentrated in the service 
method, which is called by the servlet 
engine whenever a request is made that 
the servlet should handle. In this case, 
PSPServlet.service is called when a web
browser requests a file ending in “*.PSP” 
from the server. PSP’s service method is 
straightforward and basically looks for 
some other object to shift the work to— 
a PSPAppContext object. Python Server 
Pages use application workspaces to 
keep track of PSP pages and allow them 
to interact. Each workspace is managed 
by an instance of PSPAppContext (PSP- 
AppContext.java is also available elec- 
tronically). The remaining methods of 
PSPServlet work together to determine
which application context can service 
the request. 

The getApplication method of PSPServlet
is responsible for looking through the 
cache for the application. New applica- 
tion contexts are created by looking at the 
physical path of the page being loaded. 

Figure 1: What happens when a user requests a web page processed by a Java Servlet engine.

If the path of the page also contains a file
called “global.psa,” a new context is cre- 
ated for this application and is tied to this 
path. If no global.psa is found, the loadApplication
method searches the parent
directory for the file. This searching goes 
on until a global.psa file is found or the 
root of the web server is encountered. 
If the root is found, then the page is as- 
sumed to belong to the default applica- 
tion context. This is exactly the way that 
ASP looks for its global.asa file. This also 
means that a PSP application can con- 
tain subdirectories and still belong to the 
same application. It is the global.psa file 
that controls where an application be- 
gins or ends. 
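
The upward search for global.psa can be sketched with plain java.io.File. This is an illustration under assumptions rather than the loadApplication code itself: the class name AppRootLocator and the method findApplicationRoot are hypothetical, and the sketch works on physical directories directly instead of translating virtual paths through the servlet request.

```java
import java.io.File;

/** Hypothetical helper: find the directory that owns a PSP page. */
final class AppRootLocator {
    static final String GLOBAL_SCRIPT = "global.psa";

    private AppRootLocator() {}

    /**
     * Walks from startDir up towards webRoot and returns the first directory
     * containing global.psa. If none is found, webRoot is returned, which
     * stands for the default application context.
     */
    static File findApplicationRoot(File startDir, File webRoot) {
        File root = webRoot.getAbsoluteFile();
        File dir = startDir.getAbsoluteFile();
        while (dir != null) {
            if (new File(dir, GLOBAL_SCRIPT).isFile()) {
                return dir; // this directory marks the start of an application
            }
            if (dir.equals(root)) {
                break; // reached the top of the site without finding a marker
            }
            dir = dir.getParentFile();
        }
        return root;
    }
}
```

Because every subdirectory resolves to the nearest ancestor holding the marker file, pages in nested directories end up in the same application, which matches the behavior described above.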

So ends the process of determining 
which instance of PSPAppContext can han- 
dle the request. The entry point to this 
process, getApplication, is a synchronized 
method. Web servers are multithreaded to 
handle multiple user requests; Java servlets 
should also be prepared to serve multi- 
ple requests. Synchronized methods are 
used in a few selected instances to pro- 
tect code that is not otherwise thread safe. 
The JPython interpreter (available from
the PythonInterpreter class) implements
the Python language within a single environment [...]
other. In the movie Ghostbusters, the


team wants to know why they shouldn't 
cross the streams of the unlicensed nu- 
clear accelerators on their backs. Harold 
Ramis’s rather vague answer is, “That 
would be bad.” I don’t know what hav- 
ing all of the pages within multiple un- 
related applications in the same envi- 
ronment would be like, but it definitely 

Pages within an application can exchange
data while applications are prevented
from interacting (within the
JPython interpreter at least) with each
other.

It is amazing how simple Python Server
Pages started out to be. Getting JPython
to execute a block of code is as simple as 
a call to the PythonInterpreter method exec
(see Listing Three). In fact, the original 
version of PSP had only one class and 
most of the work was handled within the 
service method of PSPServlet. 

If a function called Application_Start 
exists in global.psa, it is called as soon as 
the PSPAppContext object is created. You 
can use this method to set up any services 
required by the PSP application. I have
used this method to initialize data struc- 
tures, log into databases, clean up tem- 
porary files, and similar activities. Any vari- 
ables you create in global.psa are available 
to any of the executing PSP pages. PSP 
applications— like the Java servlets they 
are based on— must support multiple si- 
multaneous requests. This means that any 
data structures you provide in global.psa 
should be read only, or use JPython’s syn- 
chronization features to control access. 

One of the more complicated activities 
performed by PSPAppContext is to get the 
given page translated into syntactically cor- 
rect JPython and execute the code. The 
processPage method is the starting point 
for this process and is called by 
PSPServlet’s service method. Like applica- 
tion contexts, PSPAppContext keeps a 
cache of the pages being executed. Why 
cache the pages? Here are the steps in- 
volved in executing a single PSP page 
from scratch: 


1. Translate the virtual path of the page
into a physical path (/spam/displaymenu.psp
into c:\inetpub\wwwroot\spam\displaymenu.psp).

2. Open the physical page and process it
into a real JPython script.

3. Compile the script into Java bytecodes.

4. Execute the page.


Reading the page, translating it into 
JPython, compiling it, and finally execut- 
ing it are expensive processes. Once the 
page has gone all the way through step 
3, the compiled bytecodes (really a Py- 
Code object from JPython) are stored in 
the g_scripts hashtable. The timestamp of 
the original script file is stored in g_dates. 
If you update the actual page, the script 
is loaded and recompiled the next time it 
is accessed. This makes testing and up- 
dating your PSP applications more con- 
venient. Every time you update a page, 
Python Server Pages makes the changes 
immediately available. On the other hand, 
checking the timestamp each time a page 
is executed is a little expensive itself. Per- 
haps a future version will contain a con- 
figuration option to turn this feature off. 
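
The timestamp check can be illustrated with a small cache class. This is a sketch under assumptions, not the PSPAppContext implementation: the name ScriptCache and its compiler callback are hypothetical, the two maps merely play the roles of g_scripts and g_dates, and the "compiled" value is whatever the callback returns rather than a JPython PyCode object.

```java
import java.io.File;
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

/** Hypothetical cache that recompiles a page when its file changes. */
final class ScriptCache<T> {
    private final Map<String, T> scripts = new HashMap<>();  // compiled pages
    private final Map<String, Long> dates = new HashMap<>(); // file timestamps
    private final Function<File, T> compiler;

    ScriptCache(Function<File, T> compiler) {
        this.compiler = compiler;
    }

    /** Returns the cached result, recompiling if the file's timestamp differs. */
    synchronized T get(File page) {
        String key = page.getPath();
        long modified = page.lastModified(); // one file-system call per request
        Long cached = dates.get(key);
        if (cached == null || cached.longValue() != modified) {
            scripts.put(key, compiler.apply(page));
            dates.put(key, modified);
        }
        return scripts.get(key);
    }
}
```

The lastModified call on every request is the cost the text refers to; a configuration switch of the kind suggested above would simply skip that comparison once a page is in the cache.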

The methods getPythonScript, load- 
PythonServerPage, and compile are in 
charge of maintaining the cache. When 
attempting to execute a page, the first stop 
is getPythonScript. If the page cannot be 
located within the cache (or has been up- 
dated since being cached), the load- 
PythonServerPage method is called. After 
a wait, this method sends the page through 
the code generator, which stores its re- 
sults (a JPython script) in a file of its own 
choosing. The filename is returned to loadPythonServerPage,
which in turn passes it
to the compile method. This simple
method loads the file into memory, then
calls the standard JPython method 
__builtin__.compile to translate the page 
into Java bytecodes. Assuming all this goes 
as planned, the compiled page is ready 
for execution. Upon looking through the 
code, you will notice that PSP does not 
limit its interaction with JPython to the 
PythonInterpreter class. PythonInterpreter 
provides most of the functionality need- 
ed to integrate JPython in simple appli- 
cations; however, more advanced appli- 
cations require tighter coupling to JPython. 
I have found JPython’s __builtin__ and
Py modules to be useful. There is no documentation
for these modules, so expect
to spend time with the JPython source 
code if you choose to integrate your Java 
application this tightly with JPython. 
In the actual source code, there are several
versions of PSPAppContext’s exec
method, but PSPAppContext.java (available 
electronically) shows the main method. 
This is a simple method because most of 
the work is done in ExecContext. The Exec- 
Context class is in charge of setting up the 
execution environment when using JPython 
to execute a single page within the appli- 
cation. To perform the magic of keeping 
each application’s namespace separated, 
PSPAppContext and ExecContext use 
JPython-based dictionaries to hold all of 
the objects related to the application. 
PSPAppContext actually sets this up in its 
constructor when it creates a copy of the 
dictionary it was passed by PSPServlet. Dictionaries
are basically hashtables that
JPython uses to store and look up variable
names, functions, classes, objects, and anything
else created by executing JPython
scripts. The beauty of JPython is that when
you execute a script, you can provide your
own dictionary to be used for this purpose.
ExecContext’s purpose is to place various
objects into the namespace prior to a script
being executed.

Figure 2: How PSP objects interact within the servlet.

So what does ExecContext put into the 
execution environment? A script that can’t 
find out about its environment would not 
be very useful. A Java servlet has access 
to Request and Response objects that pro- 
vide access to information coming into 
the servlet (Request) and information go- 
ing back out of the servlet (Response). So 
that the scripts can access these objects, 
ExecContext places them into the dictio- 
nary used by PSPAppContext when exe- 
cuting a script. Unfortunately, this must 
be done each time a script is executed, 
because the Request and Response objects 
are only valid during a single call to the 
servlet’s service method. 
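
The namespace handling can be pictured with ordinary Java maps. The sketch below is an analogy under assumptions, not PSP's code: AppNamespace, define, and forRequest are hypothetical names, java.util.Map stands in for a JPython dictionary, and, unlike the article's ExecContext, which places Request and Response into the application's dictionary, this version builds a separate per-request copy so that concurrent requests do not overwrite each other's objects.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical stand-in for one application's namespace. */
final class AppNamespace {
    private final Map<String, Object> globals;

    /** Each application starts from its own copy of the base dictionary. */
    AppNamespace(Map<String, Object> base) {
        this.globals = new HashMap<>(base);
    }

    /** Adds an application-wide name, as a global.psa script might. */
    synchronized void define(String name, Object value) {
        globals.put(name, value);
    }

    /** Builds the namespace for one call to the servlet's service method. */
    synchronized Map<String, Object> forRequest(Object request, Object response) {
        Map<String, Object> namespace = new HashMap<>(globals);
        namespace.put("Request", request);   // valid for this request only
        namespace.put("Response", response);
        return namespace;
    }
}
```

Two AppNamespace instances created from the same base share nothing afterwards, which is the isolation between applications that the article describes.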

Once the namespace is completely
configured, the exec method has little to
do. It has a namespace and block of Java
bytecodes to execute, which it does. Figure 2
shows how these Python Server
Pages objects interact within the servlet
to produce the output that is sent to the
browser.

The code generator translates most ev- 
erything into a call to a __write__ method. 
There is no such method in JPython, but 
the code generator expects you will de- 
fine one before you try to use the fruits 
of its labor. In this case, I want all of the 
output of the pages to go to the user’s 
browser. PSPAppContext accomplishes this 
by routing all calls to __write__ to the Response.write
method. The code in PSPAppContext
sets all this up by creating and
compiling a __write__ method into the 
namespace that will be used by any exe- 
cuting pages. 


Pythonizing 

In the original version of PSP, the Re- 
quest and Response objects provided by 
the servlet engine were passed directly 
to the executing script. As soon as I start- 
ed writing PSP applications, however, I 
hated that most of the page looked and 
worked like Python code while these ob- 
jects were clearly Java. For instance, this 
is how you would get a parameter 
passed to your page using the servlet 
version of Request: 


parm = Request.getParameter("myparm") 


Because the parameters are really just a
hashtable of values, it is a shame that I
couldn’t use Python’s normal mode of accessing
keyed values: the dictionary. Consequently,
PSP is now composed of several
objects (PyRequest, PyResponse, PyParams,
and so on) that are wrappers around their
servlet counterparts. In PSP, the aforementioned
section of code now looks like this:


parm = Request.params["myparm"] 


This may not look like a huge change, but
to a Python programmer it means everything
in the world. All of the major objects
passed to the PSP pages have been
“Pythonized” in some way to make them
more palatable to Python programmers.


Conclusion 

With PSP’s foundation in Java, applica- 
tions written using PSP are portable to any 
environment where Java and Java servlets 
can be found. From a web developer’s 
view, especially an Active Server Pages-trained
developer, PSP offers a cheap way
to implement server-side scripting on a 
platform that is portable to virtually every 
Java platform available. To make you feel 
more at home, PSP provides many of the 
same objects and services available to ASP 
applications. 
Listing One

public class PSPServlet extends HttpServlet {
    PyObject m_cgEngine = null;
    static {
        // Initialize the JPython interpreter
        PSP.interp.exec( "import sys" );
        // Put our installation directory at the beginning of the JPython
        // search path. That means our modules get loaded before anything else,
        // keeping us away from any nasty module collisions.
        PSP.interp.set( "PSPSearchPath", Py.java2py(PSP.pspRoot) );
        PSP.interp.exec( "sys.path = [PSPSearchPath] + sys.path" );
        // Load up the JPython based code generator module used by PSP
        PyObject cg = __builtin__.__import__( new PyString("cg") );
        PyObject cgVersion = cg.__getattr__( "Version" );
        // Create an instance of the code generator that we can use
        String cachedir = PSP.pspRoot + "cache";
        PyObject engineClass = cg.__getattr__( "cgEngine" );
        PyObject engine = engineClass.__call__( Py.java2py(cachedir) );
        PSP.codeGenerator = engine;
    }
    public void service(ServletRequest req, ServletResponse res)
            throws ServletException, IOException {
        String psp = ((HttpServletRequest) req).getServletPath();
        // Get an application object to satisfy this request.
        PSPAppContext ac = getApplication( (HttpServletRequest) req );
        ac.processPage(
            psp, (HttpServletRequest) req,
            (HttpServletResponse) res );
    } // service
    synchronized PSPAppContext getApplication( HttpServletRequest req ) {
        String psp = req.getServletPath();
        psp = psp.replace( '/', File.separatorChar );
        psp = psp.replace( '\\', File.separatorChar );
        // Get application name
        File pspFile = new File( psp );
        String appName = pspFile.getParent();
        PSPAppContext app = (PSPAppContext) PSP.getApp( appName );
        if ( app == null )
            return loadApplication( appName, req );
        else
            return app;
    } // getApplication
    PSPAppContext loadApplication( String name, HttpServletRequest req ) {
        File pspFile = new File( name );
        String appName = name;
        // Look for a global script file indicating the
        // base directory of an application.
        while ( appName != null ) {
            String globalScript = req.getRealPath(
                appName + File.separatorChar + PSP.GLOBAL_SCRIPT );
            File f = new File(globalScript);
            if ( f.exists() )
                break;
            pspFile = new File( appName );
            appName = pspFile.getParent();
        } // while
        if ( appName == null )
            appName = new String( "" + File.separatorChar );
        PSPAppContext app = (PSPAppContext) PSP.getApp( appName );
        if ( app == null ) {
            app = new PSPAppContext(
                (PyStringMap) PSP.interp.getLocals(), tr, appName );
            PSP.addApp( appName, app );
        }
        // If we didn't find this application in the same
        // directory where we started
        if ( !appName.equals(name) ) {
            // Put this application object in the cache
            // under the directory name where we finally found it.
            PSP.addApp( name, app );
        }
        return app;
    } // loadApplication
}

Listing Two

public class PSP {
    static PythonInterpreter interp = new PythonInterpreter();
    static PyObject codeGenerator = null;
    static Hashtable apps = new Hashtable(); // Cache for Application Contexts
    static String pspRoot = findRoot();
    static final String GLOBAL_SCRIPT = "global.psa";

    // finds the directory where psp is installed
    public static String findRoot() {
        String root;
        // If find psp.jar in class.path
        String classpath = System.getProperty("java.class.path");
        if (classpath == null) return null;
        int jpy = classpath.toLowerCase().indexOf("psp.jar");
        if (jpy == -1) {
            return null;
        }
        int start =
            classpath.lastIndexOf(java.io.File.pathSeparator, jpy)+1;
        return classpath.substring(start, jpy);
    } // findRoot

    // adds a new application context to the cache
    static public synchronized void addApp( String name, PSPAppContext app ) {
        apps.put( name, app );
    }

    // gets a cached application context from the cache
    static public synchronized PSPAppContext getApp( String name ) {
        return (PSPAppContext) apps.get( name );
    }

    // clears the application cache for the servlet
    static public synchronized void clearCache( boolean newValue ) {
        apps = new Hashtable();
        System.gc();
    }

    // returns a python dictionary with statistics for the loaded apps
    static public synchronized PyDictionary getAppStats() {
        Hashtable ht = new Hashtable( apps.size() );
        Enumeration e = apps.elements();
        while ( e.hasMoreElements() ) {
            PSPAppContext app = (PSPAppContext)e.nextElement();
            Hashtable htPages = new Hashtable( app.g_scripts.size() );
            Enumeration scripts = app.g_scripts.keys();
            Enumeration dates = app.g_dates.elements();
            while ( scripts.hasMoreElements() ) {
                PyString script = new PyString( (String)scripts.nextElement() );
                PyLong date = new PyLong( ((Long)dates.nextElement()).longValue() );
                htPages.put( script, date );
            } // while
            ht.put( Py.java2py(app.g_appName), new PyDictionary(htPages) );
        } // while
        return new PyDictionary( ht );
    } // getAppStats

    // generate a PyDictionary containing HTTP cookies
    static public PyDictionary makeCookies( Cookie[] cookies ) {
        if ( cookies == null )
            return new PyDictionary();
        Hashtable ht = new Hashtable( cookies.length );
        for( int i = 0; i < cookies.length; i++ ) {
            Cookie cookie = cookies[i];
            ht.put( Py.java2py( cookie.getName() ),
                    Py.java2py( cookie ) );
        } // for
        return new PyDictionary( ht );
    } // makeCookies
}

Listing Three

import org.python.core.*;
import org.python.util.PythonInterpreter;

public class SimpleEmbedded {
    public static void main(String []args) throws PyException {
        PythonInterpreter interp = new PythonInterpreter();
        System.out.println("Hello, brave new world");
        interp.exec("import sys");
        interp.exec("print sys");
        interp.set("a", new PyInteger(42));
        interp.exec("print a");
        interp.exec("x = 2+2");
        PyObject x = interp.get("x");
        System.out.println("x: "+x);
        System.out.println("Goodbye, cruel world");
    }
}

Andrew Dwelly


Jon Bentley once asked the question,
“Why can’t we read programs in the 
same way that we read novels or mag- 
azine articles?” One answer is that, 
with the exception of comments, programs 
are not primarily meant for human con- 
sumption — they’re written for execution 
by computers. Of course, over the years, 
we developed tricks to make life easier 
for ourselves, including meaningful vari- 
able names, the use of abstraction, and so 
on. Still, reading code remains an effort. 
One reason for this difficulty is that code 

is ordered in the way required by the pro- 
gramming language, rather than the order 
in which we create it and think of it. Even 
when working with a system as produc- 
tive as Java, writing code usually consists 
of adding some code to one class, then 
more code to another class — building up 
to a complete working system. Explaining
code to someone often follows a similar
route; you trace an execution thread
from one class to another until you can
see how a complete task is accomplished.

Andy is a computer consultant who lives
on Guernsey in the Channel Islands. He
can be contacted at adwelly@hotmail.com.

In the early 1980s, Donald Knuth popularized a solution to the problem
of understanding code with his concept of “literate programming.” As
Figure 1 illustrates, Knuth’s WEB system takes a single source
document containing code and explanatory text laid out in the order
you would describe a program. Running a WEB document through Knuth’s
Tangle program extracts the code parts and assembles them into a
syntactically correct (though not very readable) program. If you run
the same source through his Weave program (available electronically;
see “Resource Center,” page 7), you get TeX source, which can create a
beautifully typeset document. TeX produces a device-independent print
format called “DVI,” which needs one further step to translate it into
instructions for a particular printer language such as PostScript. In
short, in the hands of a skilled practitioner, a literate program can
be read as a well-structured essay.

Dr. Dobb’s Journal, February 2000

Sun’s javadoc utility (part of the JDK) is 
similar to Weave, although it targets Java. 
Javadoc works with some simple Java com- 
menting conventions to produce HTML 
documents that act as a reference to a col- 
lection of classes. But reading a reference 
manual is not necessarily the best way of 
understanding how to use a complicated 
API; hence the huge number of “how to” 
programming books found in bookstores. 
The documents produced by Weave are 
anything but reference manuals — they are 
often works of art and can be read for en- 
joyment as well as explanation. 

While Knuth’s WEB system produced 
output via TeX and worked for Pascal 
programs, Marius, the system I present 
here, implements some of Knuth’s ideas 
using Java as its programming language, 
with HTML as the output. In the process, 
I also leverage the power of XML. 

Marius actually refers to my literate programming system as a whole:
both the Weave and the Comb programs (also available electronically
both in source and compiled form). The code to Weave and Comb is
freely available and published under the Open Source Artistic License.

Assuming you have tagged Java with XML tags, Marius lets you use Weave
to generate literate HTML, then use Comb to generate literate Java.
More specifically, you create a source document containing both Java
and explanatory text. Weave creates formatted output in HTML, then
Comb extracts the Java, reassembles it, and produces a normal looking
Java program (unlike Knuth’s original system that deliberately
produced obfuscated Pascal). To illustrate, I include a large example
lit file that describes a minor problem I found [...] class. To
generate Java files and an out.html file, you'll need Marius and the
XML jar from Sun. You execute them with:

java Comb TreeExample.lit
java Weave TreeExample.lit

[...] sets of tags that can be specially defined for the purpose.
There are a number of programmer tools (such as parsers) that make it
easy to work with XML. In the case of Marius, the source document has
to be marked up both into areas that represent code, and those that
represent explanation. XML is the ideal way to accomplish this. Using
Sun’s free XML parser (available at http://java.sun.com/xml/) means
that projects such as Marius can be created in days, rather than
months.

[...] point in the document. You use Weave to take the source document
and produce HTML that can be read with an ordinary web browser. As
Figure 2 illustrates, you then use the Comb program (available
electronically) to take the source document and produce syntactically
correct Java. Listing One, for instance, is part of a Marius source
document describing a simple benchmark [...] code. The resulting HTML
pages are formatted in a way that follows the XML closely. Weave only
departs from this when encountering a <CLASS> tag, when it outputs a
small fragment of text that not only resembles a Java class
declaration, but also includes a hypertext link to the actual source
so it can be seen in its original form.

The code for the NFIB algorithm, which 
is really the focal point of that particular 
example, was discussed prior to the in- 
troduction of the Benchmark class. In fact 
in the complete example, Benchmark also 
presents a fairly substantial user interface 
that occupies far more of the code in per- 
centage terms than the actual algorithm. 
Although this code must appear in the fi- 
nal Java output, it is captured in <CODE> 
tags where the attribute ELIDE (which de- 
faults to F) has been set to T. Weave sim- 
ply leaves this out of the HTML. 

The ability to express code like this is 
the main advantage of literate program- 
ming. Cumbersome but irrelevant details 
can be left until some later point, or re- 
moved altogether. Small, but vital, points 
can be emphasized. You become an au- 
thor carefully drawing attention to one 
area, while drawing a veil over other pieces 
of the code. Marius breaks up <NARRA- 
TIVE> tags into paragraphs <P>, and in 
fact any HTML tags legal within <P> can 




be included using the <![CDATA[ ]]> tag.
Text that appears inside a CDATA tag is 
ignored by the XML parser and is simply 
copied to the output by Weave. As a re- 
sult, you can include the usual range of 
HTML tricks including images, different 
colored fonts, Java applets, and so on. In 
fact, you can pretty much do anything nec- 
essary to make yourself and the code you 
are presenting clearer to your audience. 
A Marius source document keeps code 
fragments, classes, and explanation to- 


gether in a single source document. A sin- 
gle Marius source can contain several com- 
plete programs. As a result, explanation 
and code tend to remain synchronized 
with each other. The common experience 
of finding out-of-date code in a manual 
is eliminated. 

Comb is responsible for reassembling working Java source code from the
Marius source document. This is handled as a three-phase process after
the XML parsing is complete.





Figure 1: Knuth’s WEB system. 





Figure 2: The Marius system. 
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e In the first pass, Comb builds up a 
database of classes and code fragments. 
It records the NAME, EXTENDS, and 
WEIGHT attributes of each fragment and 
the NAME, EXTENDS, IMPLEMENTS, and 
PACKAGE attributes of each class. It also 
recognizes a VISIBILITY attribute that de- 
faults to Public and an ABSTRACT at- 
tribute that defaults to F. The EXTENDS 
attribute in a code fragment simply means 
that the code fragment is part of anoth- 
er fragment or class. The EXTENDS at- 
tribute in a class tag has the usual Java 
inheritance meaning; see Figure 4. 

The second pass visits each fragment in 
the database in turn, in no particular or- 
der. If the fragment extends another frag- 
ment, the extended fragment is informed. 
Thus, Figure 4 becomes Figure 5. 

The third pass can now easily perform 
a depth-first tree traversal outputting 
Java source as it goes. So that code frag- 
ments can be forced to appear in the 
correct order (since you want readable 
code) the WEIGHT attribute is used to 
sort nodes at the same level, fragments 
with a lighter weight appear earlier than 
fragments with a heavier weight. 
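
The three passes can be sketched in a few lines of Java. This is an illustrative reconstruction, not Comb's actual source: the CombSketch and Node names are hypothetical, only the NAME, EXTENDS, and WEIGHT attributes are modeled, and the output simply lists each fragment's text in traversal order rather than nesting it inside class braces.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch of the three passes described above; not Comb's real code.
class CombSketch {

    /** A class or code fragment; parent is the EXTENDS target, or null for a root. */
    static final class Node {
        final String name;
        final String parent;
        final int weight;
        final String code;
        final List<Node> children = new ArrayList<>();

        Node(String name, String parent, int weight, String code) {
            this.name = name;
            this.parent = parent;
            this.weight = weight;
            this.code = code;
        }
    }

    private final Map<String, Node> database = new LinkedHashMap<>();

    /** Pass 1: record every class and fragment under its name. */
    void record(String name, String parent, int weight, String code) {
        database.put(name, new Node(name, parent, weight, code));
    }

    /** Pass 2: visit each node and inform the node it extends. */
    void link() {
        for (Node node : database.values()) {
            if (node.parent == null) {
                continue;
            }
            Node target = database.get(node.parent);
            if (target == null) {
                throw new IllegalStateException("Unknown EXTENDS target: " + node.parent);
            }
            target.children.add(node);
        }
    }

    /** Pass 3: depth-first traversal from a root, lighter siblings first. */
    String emit(String rootName) {
        StringBuilder out = new StringBuilder();
        emit(database.get(rootName), out);
        return out.toString();
    }

    private void emit(Node node, StringBuilder out) {
        out.append(node.code).append('\n');
        node.children.sort(Comparator.comparingInt(n -> n.weight));
        for (Node child : node.children) {
            emit(child, out);
        }
    }

    public static void main(String[] args) {
        CombSketch comb = new CombSketch();
        comb.record("Benchmark", null, 0, "// class Benchmark");
        comb.record("main", "Benchmark", 90, "// fragment main");
        comb.record("nfib", "Benchmark", 70, "// fragment nfib");
        comb.link();
        System.out.print(comb.emit("Benchmark"));
    }
}
```

Because the second pass only attaches children to parents, fragments can be declared in any order in the source document; the WEIGHT sort in the third pass is what fixes their order in the output.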


Unlike Knuth’s original work, which was meant for printing, Marius
outputs for a widely distributed hypertext system. I occasionally want a
hypertext link to refer to some class or code fragment within the Marius
source file. Weave marks each fragment with an anchor in the HTML so it
can be referred to easily, and there is also a <REF> tag that is legal to
use within <P>s, as you can see from the example.

It's common to find both import statements and javadoc comments between a
package statement for a class and the actual declaration of the class
itself. However, any code fragment that extends a class appears within
the class itself. To get around this problem, whenever a new class is
introduced, Comb also creates a bookmark in its database for fragments
that must end up in this position; if the class is called “Benchmark,”
the bookmark is called “Benchmark.prelude.” So to include the
user-interface packages in the example, you might see Listing Two in
the Marius source.

Marius lets you generate literate HTML, then literate Java

Figure 3: Browser view of Listing One.
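
The property that the Nfib narrative in Listing One relies on (the value returned equals the number of calls made) is easy to check. The sketch below wraps the nfib method from Listing One in a hypothetical NfibDemo class and adds a call counter; the counter and the class name are additions for illustration only.

```java
// Hypothetical demo class: nfib as in Listing One, plus a call counter.
class NfibDemo {
    static long calls = 0;

    static int nfib(int n) {
        calls++;                      // count every invocation, including this one
        if (n < 2) {
            return 1;
        } else {
            return nfib(n - 1) + nfib(n - 2) + 1;
        }
    }

    public static void main(String[] args) {
        for (int n = 0; n <= 10; n++) {
            calls = 0;
            int value = nfib(n);
            System.out.println("nfib(" + n + ") = " + value + ", calls = " + calls);
        }
    }
}
```

The two numbers agree because they satisfy the same recurrence: a call with n < 2 counts as one call and returns 1, and any other call counts as one call plus the calls made by its two recursive invocations, which is exactly how the return value is built.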

Comb outputs code for each class in its own file following the
conventions expected by Java compilers (if the class is X, its source is
to be found in the file X.java), but it ignores directory structure.
Weave simply outputs to the file out.html. 
These design decisions were taken to 
keep Comb and Weave as simple and 
flexible as possible. It is common to run 
both programs with a short script to re- 
name the output of Weave to something 
more appropriate, move the Java files to 
their appropriate directories, then run the 
Java compiler. 

The code that appears in the fragments is (with two unfortunate
exceptions) much as you would expect to find it in an ordinary Java
source file. Using XML to create the Marius source gives many advantages,
but it has a problem with the “<” character that is used to signal the
beginning of a new tag. Thus, in our example, where the code was
if (n < 2) {, you actually have to write if (n &lt; 2) {. There’s a
similar problem with the “&” character, which has to be expressed as
&amp;. So, in Marius source, you will sometimes see expressions like:


if (object != null &amp;&amp; object.someCall()) {


Admittedly, this looks strange. Fortunately, if you make a mistake and
use the more natural < or &&, the parser produces a fairly meaningful
error message that enables you to track down the problem pretty quickly.
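
If you would rather not do this substitution by hand, a small helper can do it before the code is pasted into a fragment. The following is a minimal sketch, not part of Marius; the XmlEscape class name is hypothetical, and it handles only the two characters discussed here.

```java
// Hypothetical helper, not part of Marius: escapes code for use inside a CODE tag.
class XmlEscape {

    /** Escapes the two characters that cannot appear literally in XML character data. */
    static String escape(String code) {
        // Replace '&' first so the ampersand introduced by "&lt;" is not escaped again.
        return code.replace("&", "&amp;").replace("<", "&lt;");
    }

    public static void main(String[] args) {
        // prints: if (n &lt; 2 &amp;&amp; ok) {
        System.out.println(escape("if (n < 2 && ok) {"));
    }
}
```

The order of the two replacements matters: escaping "<" first would produce "&lt;", whose ampersand would then be turned into "&amp;lt;".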











Figure 4: First pass using the 
EXTENDS attribute. 











Figure 5: Second pass using the 
EXTENDS attribute. 

The tags that can legally appear in the 
Marius source file are controlled by a 
Document Type Declaration (DTD). The 
DTD that is currently used with Marius 
states that a conforming and valid Mar- 
ius document contains an optional 
<COPYRIGHT> tag followed by one or 
more <SECTION> tags. The text ap- 
pearing between the copyright tags is 
placed at the beginning of every Java file 
output by Comb. 


<!ELEMENT LIT (COPYRIGHT?, SECTION+)>
<!ELEMENT COPYRIGHT (#PCDATA)>
<!ELEMENT SECTION ((SUBSECTION | NARRATIVE | CODE | CLASS | FILE)*)>
<!ELEMENT SUBSECTION ((NARRATIVE | CODE | CLASS | FILE)*)>


Sections can be further subdivided into 
<SUBSECTION> tags, although the de- 
cision to use subsections tends to de- 
pend on the length of the program you 
are writing. In any case, sections and 
subsections both contain narrative, class- 
es, and code fragments. There is also the 
possibility of including other text files 
that may be required by the program us- 
ing the <FILE> tag. Comb treats these as 
a special case of <CLASS> except that 
the name of a file must be fully speci- 
fied in its NAME attribute—so you 
sometimes see something like this in a 
Marius source file: 


<FILE NAME="Readme.txt">
To install Benchmark simply execute the install.exe file.
</FILE>


Similar Systems 

Marius isn’t the only Java/XML-based literate programming system. The ABC
system (presumably named after its author Anthony B. Coates and available
at http://www.allette.com.au/xml-litprog/) shares some similarities with
Marius. It’s a literate programming project written in Java using XML
documents as input. However, ABC is more ambitious: eventually it will
be language neutral (you could create C++ programs this way) and
typesetter independent (you may be able to output PostScript or PDF, not
just HTML).







Likewise, there is an excellent, language-neutral TeX-based system called
“FunnelWeb” (http://www.ross.net/funnelweb/), and of course Knuth’s
original work rolls on, as well as a related C-based version called
“CWEB.” Also, Perl 5 lets you insert documentation directly into Perl
programs with what are called “PODs.”


None of these use XML as an input lan- 
guage, of course. They were all done 
before it was widely released. It’s part- 
ly XML’s unrecognized potential for let- 
ting you easily create small languages in 
the AWT sense that I find so interesting 
about it. 


Conclusion 

I have also used Marius source files to 
convince people that certain pieces of 
code contain bugs. Although this is a sim- 
ple process when sitting at the opposite 
desk, we often have to use code that was 
created and is maintained by distant 
groups. Quite often, there’s not even a 
phone number to call and all that can be 
done is to send a working example of the 
problem to a support e-mail address. You 
instantly hit the problem of emphasis. How 
much of the code in an example is pre- 
sent for support purposes, and how much 
of the example code demonstrates the 
problem? Quite often, the broken piece 
of code is one or two lines buried in a 
much larger structure— but without the 
additional code it cannot be demonstrat- 
ed. I’ve found that describing an issue with 
Marius usually brings a response where 
other methods have failed. 

Marius has been kept deliberately sim- 
ple because it was created as something 
of an experiment. Now that it has been 
in use for a little while, there are a vari- 
ety of additional features that have been 
suggested — mainly for Weave. The ex- 
amples that have been produced with it 
have been growing steadily larger and 
are reaching the stage where they are 
becoming small books rather than large 
articles. Future versions of Marius will 
probably have to make concessions to 
this development and allow for tables of 
contents, indexes, and similar elements. 
The other issue that has arisen is that the 
HTML produced has a simple and fixed 
style. It would be nice to be able to 
make changes to the style using style 
sheets or some form of HTML template 
to allow a wider variety of HTML to be 
produced, so that it could be integrated 
with existing web sites. 
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Listing One 


<SECTION TITLE="The Nfib algorithm">
<NARRATIVE>
<P>Nfib is our next example of benchmarking algorithms, it is closely
related to the Fibonacci series, and gives
a broad idea of the speed of procedure calls in the interpreter. It
relies on its rather peculiar property
that the integer that it returns is the same as the number of procedure
calls needed to generate the number.</P>
<P>The Fibonacci series starts 1, 1, 2, 3, 5,... each element is the sum
of the two before. The <![CDATA[<B>NFIB</B>]]>
algorithm adds 1 at each stage giving the series 1, 1, 3, 5, 9,... Thus
counting from 0, we can see that nfib(3) = 5.
Here is the code for the method:</P>
</NARRATIVE>
<CODE NAME="nfib" EXTENDS="Benchmark" WEIGHT="70">
public int nfib(int n) {
    if (n &lt; 2) {
        return 1;
    } else {
        return nfib(n - 1) + nfib(n - 2) + 1;
    }
}
</CODE>
<NARRATIVE>
<P>It's quite easy to see that nfib(2) takes exactly three calls to nfib
to calculate its return value of 3.
To get a very rough idea of procedure call speed, simply divide the time
taken to calculate a relatively large
<REF PATH ="#nfib">nfib</REF> number (e.g., nfib(37)) by the return
value. </P>
<P>The algorithm can be found as part of the benchmark class defined
here, which includes a user interface:</P>
</NARRATIVE>
<CLASS NAME = "Benchmark"
       EXTENDS = "JFrame"
       IMPLEMENTS = "ActionListener"
       PACKAGE = "com.cedillasoft.benchmarks"/>


Listing Two
<NARRATIVE>
<P>We are going to use the Java Foundation classes to create the user
interface so the awt and swing packages need to be imported in the
Benchmark class.</P>
</NARRATIVE>
<CODE NAME="imports" EXTENDS="Benchmark.prelude">
import java.awt.*;
import java.awt.event.*;
import javax.swing.*;
</CODE>
The Personal Computer/Smart Card Interface (PC/SC) and OpenCard Framework
(OCF) are two industry initiatives to define a standard way to integrate
smartcards into computer systems. With PC/SC, emphasis has been placed on
the interoperability of smartcards and card terminals, and on the
integration of those card terminals into Microsoft Windows.
(Interoperability for smartcards means that one manufacturer’s card or
card reader can be used with another manufacturer's.) Members of the
OpenCard (http://www.opencard.org/) and PC/SC
(http://www.smartcardsys.com/) consortiums realized the need for a common
framework to support smartcards on various platforms (ranging from
network computers to smart phones and set-top boxes), to securely
authenticate users, and to personalize these otherwise anonymous devices
(similar to GSM cell phones that are activated and personalized via
inserting a smartcard). OCF took advantage of some features already
available within PC/SC and other smartcard standards, then focused on two
new areas: independence from host operating systems, and transparent
support of different multiapplication cards and management schemes.

In this article, we'll discuss OCF, then 
present an OCF-based terminal (reader) 
application. The complete source code for 
this Java applet is available electronically 
(see “Resource Center,” page 7). Although 
the examples and techniques are based 
on Schlumberger’s Cyberflex Open16K 
card family supporting JavaCard 2.0 API, 
OCF is platform independent. For more 
information on JavaCard development, see 
“JavaCard Application Development,” DDJ, 
February 1999. Likewise, Version 1.0 of the PC/SC standard is available
at http://www.smartcardsys.com/.


What Is OCF? 

OCF is a high-level interface that provides 
a framework for developing terminal 
(reader) applications for smartcards in 
Java. It is independent of the underlying 
operating system because it is imple- 
mented in Java. OCF supports multi- 
application cards, such as JavaCard. Its 
target platforms are network computers, 
web browsers, and any other platform 
that runs Java and needs to interact with 
smartcards. Java applications running on 
a desktop computer can use OCF to ac- 
cess smartcards. Figure 1 illustrates the 
general OCF architecture: 

• The Card Applet is provided by the card developer and runs on a
smartcard.

• The CardTerminal classes are provided by vendors who want to make their
card terminals available to OCF applications. A CardTerminal class
encapsulates the card-terminal behavior and a CardTerminalFactory class.
The factory is used by OCF to create CardTerminal instances when the
Framework is initialized. The card-terminal factory of each card terminal
attached to the desktop computer has to be registered with the
CardTerminalRegistry.

• OCF provides methods and classes for CardService to access the card
(CardChannel, CardServiceScheduler, and the like). Because there may be
more than one instance of card services per card, the
CardServiceScheduler serializes the access of different services to the
CardChannel, a communication link to the card that is represented by the
CardID.

• The CardService classes implement a standard API, thus hiding the
smartcard specifics. A card service generates application protocol data
units (APDUs) and communicates with the card to support high-level API
functions. A CardServiceFactory is associated with each CardService
implementation and is capable of constructing it. The CardServiceFactory
identifies the card or cards for which the CardService was designed. When
a smartcard is inserted into the reader, OCF goes through its list of
registered card-service factories (within the card-service registry) and
instantiates card services corresponding to the card.
Currently, OCF defines a few card-service interfaces
(FileAccessCardService or AppManagementCardService, for instance) to make
the common smartcard functions available to you. Smartcard manufacturers
are supposed to provide implementations of these classes and the
corresponding factory classes, but at this time they are generally not
available.

• The terminal application is written by you. All you need to know is the
API provided by the CardService.


Development Environment 
To develop an application using JavaCard 
and OCF, you need: 


• JDK (1.1 or higher).

• A smartcard conforming to the JavaCard 2.0 (or higher) API
specification.

• An applet loader to load applets onto the card.

• A card terminal supported by OCF.

• The OCF build tree (currently Version 1.1).

• OCF CardTerminal classes for the card terminal.


Most card vendors provide an SDK that supports features for managing
applets and communication with the card. For example, the applet loader
provided in the Cyberflex Open16K development kit needs PC/SC, but you
can use it with a pure Java applet loader. However, if you use a Java
loader based on the Java Communications API, you need to remove PC/SC
because it captures the necessary communication port.





Figure 1: OpenCard framework. 
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soning the 
Framewor 

In OCF 1.1, the code making up the 
framework is organized into components 
(base-core.jar, base-opt.jar, and the like). 
Each component is made up of one or 
more Java packages that provide a well- 
defined functionality for you. 

Depending on the card terminal you use, you should install the
CardTerminal classes. We use the Litronic 210 1.0b OCF CardTerminal
driver that uses the Java Communications API (javax.comm 2.0) to access
the device connected to a serial port. The Java Communications API can be
used to write platform-independent communications applications for
technologies such as voice mail, fax, and smartcards.

OCF obtains some configuration in- 
formation via the Java system properties. 
The Java system properties are platform- 
independent mechanisms to make oper- 
ating system and run-time environment 
variables available to programs. OCF pro- 
vides a utility class to load properties from 
a file called “opencard.properties.” 

You can configure the card terminal registry via the OpenCard.terminals
property, and the card service registry via the OpenCard.services
property. Listing One is the opencard.properties file for the
SimpleString card service.
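
To see what such a file looks like to a program, the sketch below reads a one-line properties text with the standard java.util.Properties class. It does not use OCF's own utility class (whose API is not shown in this article); the OpenCardPropertiesDemo name is hypothetical, and the property line is the OpenCard.services entry from Listing One.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.Properties;

// Hypothetical demo using only java.util.Properties, not OCF's loader.
class OpenCardPropertiesDemo {

    /** Parses properties-file text into a Properties object. */
    static Properties parse(String text) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(text));
        } catch (IOException e) {
            throw new IllegalStateException("Could not read properties", e);
        }
        return props;
    }

    public static void main(String[] args) {
        String text =
            "OpenCard.services = samples.simplestring.SimpleStringCardServiceFactory\n";
        Properties props = parse(text);
        // The key names the registry; the value names the factory class to load.
        System.out.println(props.getProperty("OpenCard.services"));
    }
}
```

In a real installation the same text would live in the opencard.properties file, and the framework, rather than your code, would read it during initialization.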


Card Applet 

The SimpleString applet lets you write a string to the card (the
setString() method), and read it back from the card (the getString()
method). The card applet class must store strings as byte arrays, but the
terminal application can easily handle type translation to store strings.
The setString() method stores a string on the card; see Listing Two. The
setIncomingAndReceive() method in the APDU class is used both to set the
JavaCard Runtime Environment (JCRE) to receive mode and to receive any
available data into the APDU buffer. The command data will not be in the
APDU buffer until it is read by the applet calling
setIncomingAndReceive() or receiveBytes(). The second method can be
called only after setIncomingAndReceive() when there is more data than
can fit into the APDU buffer. The APDU buffer has a minimum size of 37
bytes, and the maximum size is determined by you (255 bytes on Cyberflex
Open16K).


The getString() method retrieves the 
string from the card (see Listing Three). 
To access the data in the APDU buffer, 
the applet must retrieve a reference to the 
APDU data buffer by calling the APDU 
getBuffer() method. 

When a client application asks for the 
string, it has no way of knowing how long 
the string really is. We handle this in the 
following way: 


1. The client sends a getString APDU with 
the length field set at 0. 

2. The card responds with a status word set at 0x50yy, where yy is the
string length (hex).

3. The client sends the getString APDU 
again, but this time with the length field 
set at yy (string length). 
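
The three steps above can be simulated off-card to make the exchange concrete. The sketch below is not JavaCard or OCF code: FakeCard, Response, and readString are hypothetical stand-ins, the string is assumed to be shorter than 256 bytes so its length fits in yy, and 0x9000 is the usual ISO 7816 "success" status word.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical off-card simulation of the length handshake; not JavaCard code.
class GetStringHandshake {

    /** Status word plus optional response data. */
    static final class Response {
        final int statusWord;
        final byte[] data;

        Response(int statusWord, byte[] data) {
            this.statusWord = statusWord;
            this.data = data;
        }
    }

    /** Stand-in for the card applet. */
    static final class FakeCard {
        private final byte[] stored;

        FakeCard(String value) {
            stored = value.getBytes(StandardCharsets.US_ASCII);
        }

        Response getString(int length) {
            if (length == 0) {
                // Report the real length in the low byte of the status word (0x50yy).
                return new Response(0x5000 | stored.length, new byte[0]);
            }
            return new Response(0x9000, Arrays.copyOf(stored, length));
        }
    }

    /** Client side: ask with length 0, read yy, then ask again with length yy. */
    static String readString(FakeCard card) {
        Response first = card.getString(0);          // step 1
        int length = first.statusWord & 0xFF;        // step 2: extract yy
        Response second = card.getString(length);    // step 3
        return new String(second.data, StandardCharsets.US_ASCII);
    }

    public static void main(String[] args) {
        System.out.println(readString(new FakeCard("hello")));
    }
}
```

The cost of this scheme is one extra round trip per read; in exchange, the client never has to guess a buffer size.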
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The response APDU may or may not 
contain data. If the response doesn’t con- 
tain data, the applet need not do anything 
but return. 

To set the JCRE mode to send, the 
APDU method setOutgoing() is called (see 
Listing Three). This method returns the 
number of response bytes expected by 
the client application. This number corre- 
sponds to the number of bytes expected 
by the command APDU (in most cases 
the Le byte) to which the applet is re- 
sponding. 

The setOutgoingLength() method informs the JCRE about the number of bytes
the applet will be sending. The sendBytes() method sends a specified
number of response bytes from the APDU buffer.

The APDU class contains the setOut- 
goingAndSend() method combining the 
three methods described in this section. 
Using this method, the data are actually 
not sent until the applet returns from the 
process() method, at which time the data 
are combined with the status bytes. So, 
once setOutgoingAndSend( ) is called, the 
applet cannot alter the APDU buffer un- 
til process() returns. 

To compile the applet, you can use a standard Java compiler (javac, for
example). Note that the Cyberflex postprocessor mksolo requires using
javac with the debugging option “-g.” Before
you can load the applet to the card, it 
must be converted (that is, previously 
compiled with mksolo) to a form that 
the JavaCard can understand. This is 
done by the off-card part of the Java- 
Card Virtual Machine. Now you can load, 
install, and register the applet using a 
proprietary card loader provided by your 
card SDK. 


Card Service 

Once the applet is properly loaded, in- 
stalled, and registered on the card, you 
need to provide a corresponding termi- 
nal application for accessing the card. In 
other words, a special card service that 
supports the interfaces of the card applet 
is necessary. 

At this point, we'll introduce several 
OCF features by a demo card service for 
the SimpleString applet. The card service 
is not suitable for the multiapplica- 
tion/multiterminal scenario; it only sup- 
ports one terminal application accessing 
one card applet at a time. 

A card service is instantiated by the corresponding card-service factory
registered with OCF. The instantiation is performed in two steps. First,
the default constructor is called, then the initialize() method is
called, which we override. In this case, we must call the
super.initialize() method.

The card service instance provides the methods for application use. We'll
provide two methods corresponding to the two functions of the card: the
writeString() and readString() methods. To communicate with the card, the
Framework’s CardChannel class is used. It must be allocated
(allocateCardChannel()) before use, and deallocated
(releaseCardChannel()) after use. The allocateCardChannel() method makes
sure that a card channel is available for communication with the card.
This method blocks other threads until the active thread releases the
channel. The card applet must be selected before it can accept APDU
commands (see Listing Four). If the AppMgmntCardService class (see
Figure 1) was provided by Cyberflex, methods for selecting applets would
already be provided (Schlumberger has announced planned support for the
service).
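
The allocate/use/release discipline can be illustrated without any smartcard hardware. In the sketch below, a java.util.concurrent lock stands in for the card channel; the ChannelSketch class and its exchange() method are hypothetical, and only the method names allocateCardChannel() and releaseCardChannel() are borrowed from the text. How OCF implements blocking internally is not shown in this article, so the lock is an assumption.

```java
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical sketch: a lock plays the role of the single card channel.
class ChannelSketch {
    private final ReentrantLock channelLock = new ReentrantLock();
    private final StringBuilder log = new StringBuilder();

    /** Blocks until no other thread holds the channel. */
    void allocateCardChannel() {
        channelLock.lock();
    }

    /** Gives the channel back so a waiting thread can proceed. */
    void releaseCardChannel() {
        channelLock.unlock();
    }

    /** Sends one command while holding the channel; always releases it afterward. */
    String exchange(String command) {
        allocateCardChannel();
        try {
            log.append(command).append(';');
            return "ack:" + command;
        } finally {
            releaseCardChannel();
        }
    }

    boolean channelIsFree() {
        return !channelLock.isLocked();
    }

    String log() {
        return log.toString();
    }

    public static void main(String[] args) throws InterruptedException {
        ChannelSketch service = new ChannelSketch();
        Thread writer = new Thread(() -> service.exchange("write"));
        Thread reader = new Thread(() -> service.exchange("read"));
        writer.start();
        reader.start();
        writer.join();
        reader.join();
        System.out.println(service.log() + " free=" + service.channelIsFree());
    }
}
```

Releasing in a finally block, as Listing Four also does, ensures that an exception during the exchange cannot leave the channel permanently allocated.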

We also have to define a method for send- 
ing APDUs to the card (see Listing Five). 
The getCardChannel() method returns a 
reference to a CardChannel object that can 
be used for communication with the card. 
The CardChannel offers methods for send- 
ing commands and receiving responses. 


Card-Service Factory 
To create a card-service instance, OCF needs a card-service factory. Each
smartcard has a unique identifier that is returned by the card in the
Answer To Reset (ATR) message when it is powered up (in our case it is
cyberFlexATR[]). CardID is an OCF class for handling unique smartcard
identifiers that are returned in an ATR. Our
SimpleStringCardServiceFactory class can only create instances of
SimpleStringCardService (see Listing Six).
Now we are ready to register our service with OCF (see Listing Seven).
Additionally, all card-service factories must override two abstract
methods that are used by OCF to instantiate services: knows() and
cardServiceClasses() (see Listing Eight). The knows() method indicates
whether this factory knows the smartcard operating system (cardID). If it
does, the factory is able to instantiate card services for the card. The
card-service factory of each service supported by OCF has to be
registered with the card-service registry. To register the factory, add
OpenCard.services = samples.simplestring.SimpleStringCardServiceFactory
to the opencard.properties file.
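
The way a registry uses knows() to pick services for an inserted card can be sketched independently of OCF. Everything in the following example is hypothetical (the ServiceFactory interface, the Registry class, and the use of plain strings for card identifiers and service names); it mirrors only the lookup logic described above, not OCF's actual classes or signatures.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical sketch of factory lookup; these are not OCF's real types.
class RegistrySketch {

    /** Simplified stand-in for a card-service factory. */
    interface ServiceFactory {
        boolean knows(String cardId);
        List<String> serviceNames();
    }

    /** Simplified stand-in for the card-service registry. */
    static final class Registry {
        private final List<ServiceFactory> factories = new ArrayList<>();

        void register(ServiceFactory factory) {
            factories.add(factory);
        }

        /** Asks every registered factory whether it knows the card. */
        List<String> servicesFor(String cardId) {
            List<String> result = new ArrayList<>();
            for (ServiceFactory factory : factories) {
                if (factory.knows(cardId)) {
                    result.addAll(factory.serviceNames());
                }
            }
            return result;
        }
    }

    /** Builds a factory that recognizes exactly one card identifier. */
    static ServiceFactory factoryFor(final String knownId, final String serviceName) {
        return new ServiceFactory() {
            public boolean knows(String cardId) {
                return knownId.equals(cardId);
            }

            public List<String> serviceNames() {
                return Collections.singletonList(serviceName);
            }
        };
    }

    public static void main(String[] args) {
        Registry registry = new Registry();
        registry.register(factoryFor("cyberflex-atr", "SimpleStringCardService"));
        System.out.println(registry.servicesFor("cyberflex-atr"));
        System.out.println(registry.servicesFor("some-other-card"));
    }
}
```

A card that no factory recognizes simply yields an empty list, which is why an application waiting for a particular service never sees unrelated cards.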


Terminal Application 

The SimpleStringDemo terminal application (available electronically; see
“Resource Center,” page 7) writes a string to the card and reads it back.
Importing the SimpleStringCardService lets you use the high-level
interface previously defined. At the beginning we have to initialize OCF.
OCF offers two approaches to find out whether a card supporting a
particular service is inserted in the card terminal. The first one works
with event notification, an event being card insertion into or card
removal from the terminal. The second approach uses the
SmartCard.waitForCard() method, which delays the program execution until
a card supporting a specified service is inserted into the card terminal.
In our terminal application, OCF waits for a card supporting the
SimpleString card service (see Listing Nine). Now we can instantiate the
SimpleStringCardService and start using the card-service methods
(writeString() and readString()).


Conclusion 
JavaCard and OCF provide a development environment that makes it possible
for you to create platform-independent smartcard-based applications.
Because the concept is relatively new (some parts are still under
development) and rapidly changing (for example, the latest version of the
JavaCard API is 2.1, and our examples are written for 2.0), it is not yet
quite suitable for developing industry- or commercial-strength
applications. However, these two concepts, together with Sun
Microsystems' JavaCommerce platform, are promising to develop into a pure
Java development environment for e-commerce applications.
Listing One

OpenCard.services = samples.simplestring.SimpleStringCardServiceFactory
OpenCard.terminals = dk.itplus.smartcard.terminal.litronic210.LitronicCardTerminalFactory|Litronic210|Litronic210|COM1

Listing Two

private void setString(APDU apdu)
{
    buffer = apdu.getBuffer();
    // receive data from terminal
    byte size = (byte) (apdu.setIncomingAndReceive());
    byte index;
    TheBuffer[0] = size;
    // store string and its length
    Util.arrayCopy(buffer, ISO.OFFSET_CDATA, TheBuffer,
                   (short)1, (short) size);
}

Listing Three

private void getString(APDU apdu) {
    buffer = apdu.getBuffer();
    byte numBytes = buffer[ISO.OFFSET_LC];
    if (numBytes == 0)
        ISOException.throwIt((short) (SW_WRONG_LENGTH + TheBuffer[0]));
    apdu.setOutgoing();
    apdu.setOutgoingLength(numBytes);
    Util.arrayCopy(TheBuffer, (short)1, buffer, (short)0, (short) numBytes);
    apdu.sendBytes((short)0, (short)numBytes);
    return;
}

Listing Four

public void selectApplet() throws CardServiceException
{
    try
    {
        allocateCardChannel();
        sendAPDU(selectRoot);
        sendAPDU(selectApp);
    } catch(Exception e)
    {
        e.printStackTrace();
        throw new CardServiceException();
    } finally
    {
        releaseCardChannel();
    }
    appletSelected = true;
    return;
}

Listing Five

private ResponseAPDU sendAPDU(byte[] apdu) throws CardTerminalException
{
    // Set up the command APDU
    CommandAPDU commandAPDU = new CommandAPDU(apdu);
    ResponseAPDU responseAPDU = getCardChannel().sendCommandAPDU(commandAPDU);
    return (responseAPDU);
}

Listing Six

public SimpleStringCardServiceFactory()
{
    try {
        cyberFlexCID = new CardID(cyberFlexATR);
    } catch ( Exception e ) {
    }
}

Listing Seven

// register card service with OCF
static {
    services_.addElement(SimpleStringCardService.class);
}

Listing Eight

protected Enumeration cardServiceClasses(CardID cid)
{
    return services_.elements();
}
public boolean knows(CardID cardID)
{
    // check whether the factory knows the smartcard OS
    if (cardID.equals(cyberFlexCID))
    {
        return true;
    } else
    {
        return false;
    }
}

Listing Nine

// initialize OCF
SmartCard.start();
CardRequest cr = new CardRequest(SimpleStringCardService.class);
// wait for card supporting SimpleStringCardService
SmartCard sc = SmartCard.waitForCard(cr);
// instantiate card service
SimpleStringCardService ssp = (SimpleStringCardService) sc.
    getCardService(SimpleStringCardService.class, true);

DDJ
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Bringing Java's benefits 
to real-time developers 





David Hardin 


Java technologies, including the Java programming language and the Java
Virtual Machine (JVM), have achieved broad acceptance in the developer
community since their introduction in 1995. Java’s simplified object
model, strong notions of safety and security, integral multithreading
support, and promise of Write Once, Run Anywhere (WORA) have much to
offer real-time and embedded developers. However, the large size,
nondeterministic behavior, and poor performance of most Java
implementations have hampered the acceptance of Java in the real-time
and embedded communities.

The recently announced Real-Time Spec- 
ification for Java (RTSJ) promises to address 
these problem areas and bring the bene- 
fits of Java to real-time software develop- 
ers. In this article, I’ll examine the re- 
quirements and design decisions that led 
to the RTSJ, as well as provide practical ex- 
amples of its use. 


David is chief technical officer for ajile Sys- 


tems and can be contacted at david. hardin@ 
ajile.com. 
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Specification for Java 


Real-Time Java Requirements 

Recognizing Java’s potential in the real- 
time space, a group of real-time and Java 
experts convened in a series of workshops 
sponsored by the National Institute of Stan- 
dards and Technology (NIST) beginning in 





June 1998 to develop requirements for real- 
time Java. These requirements were in- 
tended to drive the development of a real- 
time Java specification, but would not 
mandate a particular set of APIs or semantic 
extensions to the Java language; this would 
be left to the specification phase. 

The NIST workshops drew participation 
from well over 50 organizations. The work- 
shops were informed by the early imple- 
mentation experience of pioneers in this 
field, notably the PERC virtual machine by 
NewMonics (see “Issues in the Design and
Implementation of Real-Time Java,” http:// 
www.newmonics.com/pdf/RTJI.pdf/; “Java 
and Embedded Real-Time Control,” Dr. 
Dobb's Java Sourcebook, 1996; and “pico- 
PERC: A Small-Footprint Dialect of Java,” 
DDJ March 1998, all by Kelvin Nilsen) and 
the JEM family of direct-execution Java mi- 
croprocessors from Rockwell Collins (see 
“The Rockwell JEM Microprocessor Family: 
An Efficient Platform for Real-Time Embed- 
ded Java,” by David Hardin, Proceedings of 
the 1998 IEEE Workshop on Programming 
Languages for Real-Time Industrial Applications, December 1998;
http://www.ajile.com/people/hardin/writings/jem-plrtia-1998.html).
Additionally, academic real-time re- 
searchers and members of the Ada com- 
munity provided guidance in the provision 
of real-time capabilities in a programming 
language context. Most usefully, end-user 
organizations provided guidance as to the 
most important requirements for real-time 
Java. The NIST workshops produced nine 
core requirements for a Real-Time Java 
specification, as well as a number of de- 
rived requirements. The requirements are 
documented in “Requirements for Real-Time 
Extensions for the Java Platform,” edited by 
Lisa Carnahan and Marcus Ruark (NIST, 
September 1999; http://www.nist.gov/rt- 
java/). 


Development of the 

Real-Time Specification for Java 

Prior to the real-time specification, Java specifications were jointly developed by Sun Microsystems and its Java source-code licensees. Responding to pressures to provide broader participation in Java standards development, Sun instituted the Java Community Process in December 1998 (http://java.sun.com/aboutJava/communityprocess/java_community_process.html). Under the terms of the Java Community Process, anyone can institute a Java Specification Request (JSR). If a specification request in a particular field is accepted by Sun, then a Call for Experts (CAFE) in that field is issued. Sun then selects a specification lead from amongst the CAFE nominees; the specification lead is responsible for convening an expert group to author the specification, as well as developing a reference implementation and conformance tests.

The RTSJ effort was launched with the first Java Specification Request, JSR-000001. The Real Time for Java Expert Group (RTJEG) convened in March 1999. It consisted of representatives from 21 organizations in industry, academia, and government, working under the leadership of IBM’s Greg Bollella. To reach a specification more quickly, the group was divided into a primary team of eight engineers (including James Gosling of Sun, the inventor of Java, as well as the author) and a consulting team. The primary team was tasked with developing the draft specification, which the consulting team would critique and refine.

The draft RTSJ was made available for 
participant review, as well as public view, 
on September 27, 1999 (public review fol- 
lowed the participant review period, be- 
ginning in December, 1999). 


Guiding Principles 

Given that there were many design alter- 
natives for real-time Java that would meet 
the NIST requirements, the first task of the 
RTJEG was to produce a set of guiding 
principles for the design. These principles 
are enumerated in the current RTSJ draft 
(http://www.rtj.org/rtj.pdf) as follows:


• Applicability to particular Java environments. The RTSJ shall not include specifications that restrict its use to particular Java environments.

• Backward compatibility. The RTSJ shall not prevent existing, properly written, nonreal-time Java programs from executing on implementations of the RTSJ.

• Write once, run anywhere. The RTSJ should recognize the importance of WORA, but must also recognize the difficulty of achieving WORA for real-time programs.

• Current practice versus advanced features. The RTSJ should address current real-time system practice as well as allow for the incorporation of more advanced features in the future.

• Predictable execution. The RTSJ shall hold predictable execution as first priority in all tradeoffs; this may sometimes be at the expense of typical general-purpose computing performance measures.

• No syntactic extension. To facilitate the job of tool developers, and thus to increase the likelihood of timely implementations, the RTSJ shall not introduce new keywords or make other syntactic extensions to the Java language.


The Design of javax.realtime 

Unlike most Java specifications that 
merely define new APIs, the real-time 
specification provides modifications to 
the Java Language Specification and the 
JVM Specification, as well as new APIs 
to enable the creation, verification, ana- 
lysis, execution, and management of 
real-time Java threads. The new APIs re- 
side in a new standard extension pack- 
age, javax.realtime. 


Class Architecture 

A fundamental design decision facing 
the RTJEG was whether classes in 
javax.realtime should inherit from stan- 
dard Java classes (for example, whether 
a real-time thread should be a subclass 
of java.lang.Thread or a new class that 
parallels java.lang.Thread in the class hier- 
archy). A completely parallel hierarchy 
would, in some sense, be simpler, but 
would render large parts of the Java en- 
vironment essentially inaccessible to real- 
time developers. Another problem with 
the parallel class hierarchy approach is 
that Java includes a number of features, 
such as threads and the “synchronized” 
keyword, that would be useful to real- 
time developers if the underlying seman- 
tics were strengthened. The RTJEG de- 
cided on a more integrated approach and 
took on the detailed design of the RTSJ 
as a series of modifications of the stan- 
dard Java environment. Real-Time Java ap- 
plications would need a special JVM on 
which to execute, but could use many of 
the features of the standard Java pro- 
gramming model. 

Figure 1 is the class hierarchy for 
javax.realtime. As you can see, classes in 
javax.realtime descend from standard Java 
classes java.lang.Object, java.lang.Thread, 
and java.lang.Throwable. javax.realtime pro- 
vides two new interface classes, Interrupt- 
ible and Schedulable. 


Detailed Design 

The RTJEG identified seven areas of the 
standard Java environment for modifi- 
cation. These are summarized in the 
RTSJ as: 
• Thread scheduling and dispatching. The RTSJ allows the programmatic assignment of parameters appropriate for the underlying scheduling mechanism in use in a given real-time system, as well as providing methods for the creation, management, admittance, and termination of real-time Java threads. Real-time Java threads are constructed as instances of class javax.realtime.RealtimeThread, which extends java.lang.Thread. While the RTJEG expects that, for the near term, particular thread scheduling and dispatching mechanisms will be bound to an implementation, the RTSJ also provides enough flexibility in the thread scheduling framework to allow implementations to provide scheduling policies unanticipated by the specification and allow future specifications to build on the RTSJ for the dynamic loading of scheduling policy modules. The RTSJ base scheduling mechanism is preemptive priority-based, FIFO within priority, with at least 28 unique priority levels.

• Memory management. Automatic memory management (garbage collection or GC) is a particularly important feature of the Java programming environment; thus, the RTJEG sought to isolate programmers from memory management concerns as much as possible. However, the group also recognized that there was no silver bullet amongst existing real-time GC algorithms. To accommodate a diverse set of preemptible GC algorithms, the RTSJ defines a memory allocation and reclamation specification that is independent of any particular GC algorithm and lets the program precisely characterize the GC algorithm’s effect on the execution time, preemption, and dispatching of real-time Java threads. To this end, the RTSJ defines new types of memory areas, ImmortalMemory and ScopedMemory, that allow the creation of Java objects but do not cause the threads that employ them to incur delays because of the execution of the GC algorithm.

• Synchronization and resource sharing. The RTJEG determined that the least intrusive specification for allowing real-time safe synchronization is to require that implementations of the current Java keyword “synchronized” include one or more algorithms that either prevent or bound execution eligibility inversion among real-time Java threads that share the serialized resource. Priority inheritance is provided by default, with optional priority ceiling emulation.


• Asynchronous event handling. The RTSJ generalizes the Java language’s notion of asynchronous event handling. An AsyncEvent is an object that is programmatically bound to an AsyncEventHandler. The AsyncEventHandler class implements Runnable, and the overridden run() method is executed when the AsyncEvent is triggered. Handlers execute with the semantics of a real-time thread (although the RTSJ does not require that handlers be implemented as threads, only that they execute as though they were).
Figure 1: (a) Class hierarchy; (b) interface hierarchy.


• Asynchronous transfer of control. The RTSJ specifies that methods that have a “throws” clause including AsynchronouslyInterruptedException (AIE) in their signature will have that exception raised by the JVM when the interrupt() method for their thread is called. This mechanism extends the current semantics of the interrupt() method from only certain blocking calls to straight-line code. The Timed class extends AsynchronouslyInterruptedException and, when constructed with a time parameter, will cause an AIE to be posted to the thread at the expiration time. If the thread is executing in a method that throws AIE, the exception will be thrown immediately. If not, the exception is said to be pending and will be thrown when the thread next reaches a method that throws AIE.


• Thread termination. Although the RTSJ does not define a general, arbitrarily invokable thread termination mechanism, the programmer can effectively terminate a thread by using the asynchronous event handling and asynchronous transfer of control mechanisms. A happening external to the JVM can be bound to an AsyncEvent that, when triggered, executes the associated AsyncEventHandler. The handler can then execute Thread.interrupt() for the target thread and, given adherence to a programming style, the target thread will unwind and complete its run() method.

• Physical memory access. Although not directly a real-time issue, physical memory access is desirable for many of the applications that could productively make use of an implementation of the RTSJ. The RTSJ thus defines a class that allows programmers byte-level access to physical memory as well as a class that allows the construction of Java objects at particular address locations in physical memory.
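
The thread-termination style described in the list above depends on the target thread cooperating with interrupt(). The following is a minimal sketch of that programming style in standard Java only; it is not RTSJ code (there is no AsyncEvent, AsyncEventHandler, or AIE here), and the class and method names are hypothetical. The call to interrupt() stands in for what an event handler would do.

```java
import java.util.concurrent.atomic.AtomicBoolean;

/** Hypothetical illustration using only java.lang.Thread; it does not use javax.realtime. */
final class InterruptTermination {

    /** Worker written in a cooperative style: it polls its interrupt status. */
    static final class Worker implements Runnable {
        /** Set in the finally block so a caller can confirm that cleanup ran. */
        final AtomicBoolean cleanedUp = new AtomicBoolean(false);

        @Override
        public void run() {
            try {
                while (!Thread.currentThread().isInterrupted()) {
                    try {
                        Thread.sleep(10);   // stands in for one unit of work
                    } catch (InterruptedException e) {
                        // sleep() clears the flag when it throws; restore it so the loop exits
                        Thread.currentThread().interrupt();
                    }
                }
            } finally {
                cleanedUp.set(true);        // unwind: release resources here
            }
        }
    }

    /** Starts a worker, interrupts it, and reports whether it finished and cleaned up. */
    static boolean startThenTerminate() {
        Worker worker = new Worker();
        Thread target = new Thread(worker, "worker");
        target.start();
        try {
            Thread.sleep(50);       // let the worker run briefly
            target.interrupt();     // the role played by the event handler in the text
            target.join(2000);      // wait up to two seconds for run() to complete
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
        return !target.isAlive() && worker.cleanedUp.get();
    }

    public static void main(String[] args) {
        System.out.println("terminated cleanly: " + startThenTerminate());
    }
}
```

On a standard JVM the interrupt is only observed at blocking calls and explicit checks, which is exactly the limitation that the RTSJ's asynchronous transfer of control is meant to lift.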


Example Uses of the RTSJ 

I'll now demonstrate the use of the RTSJ in a number of practical examples. Many of these examples provide explicit access to underlying hardware and would be rejected by the Security Manager if downloaded from an untrusted source. Also, these examples will not run correctly on a standard JVM.

Listing One illustrates the setup and use of an asynchronous event handler, in this case for an event tied to a hardware interrupt. The handler can be conceptually viewed as a thread that waits on a notification of the occurrence of the hardware event.

Listing Two demonstrates the use of ImmortalMemory as an alternative to standard garbage-collected heap allocation of objects. Even though garbage collection is a preemptible operation on JVMs that support the RTSJ, this preemption time may be too long for many applications. Scoped and Immortal memory areas do not require garbage collection, and thus can be freely accessed in the context of a NoHeapRealtimeThread that runs at higher eligibility than the collector. Listing Two also illustrates the use of scheduling parameters to achieve periodic scheduling of real-time threads.

Listing Three shows the use of asynchronous transfer of control from an operation that can be timed out. The use of an AsynchronouslyInterruptedException allows cleanup to occur quite naturally from the programmer’s perspective and is much safer than, say, setjmp()/longjmp().
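
For readers without an RTSJ implementation, the "refined answer or rough answer" pattern can be approximated on a standard JVM. The sketch below swaps in a different technique: it uses java.util.concurrent (a Future with a timed get) rather than an AIE, so it offers none of the RTSJ's timing guarantees, and cancellation only takes effect where the computation blocks or checks its interrupt status. The class name, method name, and the rough value 3 (borrowed from Listing Three) are illustrative assumptions.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

/** Standard-Java analog of a timed-out computation; not the RTSJ mechanism. */
final class TimedAnswer {
    static final int ROUGH_ANSWER = 3;

    /** Returns the refined answer if it arrives in time, otherwise the rough one. */
    static int computeAnswer(Callable<Integer> refined, long timeoutMillis) {
        ExecutorService pool = Executors.newSingleThreadExecutor();
        Future<Integer> result = pool.submit(refined);
        try {
            return result.get(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (TimeoutException e) {
            result.cancel(true);                 // interrupt the worker thread
            return ROUGH_ANSWER;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();  // preserve the caller's interrupt
            result.cancel(true);
            return ROUGH_ANSWER;
        } catch (ExecutionException e) {
            return ROUGH_ANSWER;                 // the computation itself failed
        } finally {
            pool.shutdownNow();                  // always release the worker thread
        }
    }

    public static void main(String[] args) {
        int fast = computeAnswer(() -> 42, 1000);
        int slow = computeAnswer(() -> { Thread.sleep(5000); return 42; }, 50);
        System.out.println(fast + " " + slow);
    }
}
```

As in Listing Three, cleanup lives in ordinary catch and finally blocks, which is the property the text contrasts with setjmp()/longjmp().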

Listing Four gives an example of the 
use of RawMemoryAccess to write device- 
level code in Java—in this case, an in- 
terface to the Intel 8253 Programmable 
Interval Timer/Counter chip. 


RTSJ Status and Direction 

By the time you read this article, public re- 
view of the RTSJ should be concluding. The 
Reference Implementation will then follow, 
as well as a conformance test suite, as spec- 
ified by the Java Community Process. With- 
in a year, implementations of the RTSJ will 
no doubt be available from a number of 
vendors. Most implementations of the RTSJ 
will be hosted on traditional Real-Time Op- 
erating Systems (RTOS) and many will fo- 
cus on embedded device targets. Howev- 
er, enterprise-level implementations, as well 
as implementations on platforms such as 
Real-Time Linux, are also likely. Finally, direct support for the RTSJ in silicon will be forthcoming from my company, aJile Systems, based on the JEM designs developed by the aJile team while at Rockwell Collins.
Given the network-oriented nature of much current Java development, a significant future specification effort lies ahead in the area of Distributed Real-Time. Expect to see an expert group convene in this area soon.


Conclusion 

A Java environment augmented with real- 
time capabilities provides an attractive 
object-oriented language environment for 
real-time developers, and a predictable, 
responsive platform for Java developers. 
The RTSJ documents a limited set of mod- 
ifications to the Java Language Specifica- 
tion and JVM Specification, as well as a 
set of APIs, to provide a predictable plat- 


form for the execution of Java code. The 
development of the RTSJ also demonstrates 
that Sun Java licensees and nonlicensees can 
cooperate on an equal footing under the 
Java Community Process to develop a tech- 
nically difficult specification on an aggres- 
sive schedule. 
Listing One

import javax.realtime.*;

/** Example of using Asynchronous Event/Event Handling facility to provide an
 *  interface to hardware events, i.e. interrupts. A hardware interrupt
 *  fires an AsyncEvent, which causes the associated handler to run.
 */
public class HardwareEventExample extends AsyncEvent {
    private int interruptNum;

    /** Construct a new Hardware Event for a given interrupt.
     *  @param num Interrupt number
     */
    public HardwareEventExample(int num) {
        interruptNum = num;
    }

    /** Bind a handler to the interrupt.
     *  @param h Handler for the interrupt
     */
    public void setHandler(AsyncEventHandler h) {
        super.setHandler(h);
        Hardware.bind(interruptNum, h);
    }
}

class HardwareEventHandler extends AsyncEventHandler {
    private int interruptCount = 0;

    /** Interrupt handler method. */
    public void handleAsyncEvent() {
        interruptCount++;
        // Driver code follows
    }
}

Listing Two

import javax.realtime.*;

/** Example of the use of "Immortal" memory in a periodic processing context.
 *  This example eschews heap allocation, thus avoiding garbage collection overhead.
 */
public class ImmortalMemoryExample {
    public static void main(String[] Args) {
        NoHeapRealtimeThread t = null;

        // Set up periodic processing of 1 msec each 10 msec
        PeriodicParameters timeParams = new PeriodicParameters();
        timeParams.cost = new RelativeTime(1, 0);     // 1 msec computation
        timeParams.period = new RelativeTime(10, 0);  // 10 msec period

        // Set up immortal memory; size is given in RealtimeSystem.
        MemoryParameters memParams = new
            MemoryParameters(ImmortalMemory.instance());

        // Processing is encapsulated in a Runnable
        Runner r = new Runner();

        // Create a NoHeapRealtimeThread with Periodic scheduling parameters
        // and ImmortalMemory memory parameters.
        try {
            t = new NoHeapRealtimeThread(timeParams, memParams, r);
        } catch (AdmissionControlException e) {
            return;  // thread was not admitted, so there is nothing to start
        }

        // Start processing
        t.start();
    }
}

/** Perform periodic processing in Immortal memory */
class Runner implements Runnable {
    public void run() {
        // Processing code here
    }
}


Listing Three

import javax.realtime.*;

/** Example of the use of an AsynchronouslyInterruptedException (actually, the
 *  Timed subclass of AIE) to timeout a long-running computation, and return
 *  a "rough" answer instead. Use of AIE avoids overhead of a polling approach.
 */
public class AIEExample {
    /** Timeout class */
    private class MyTimed extends Timed {
        public MyTimed(HiResTime timeout) {
            super(timeout);
        }
    }

    /** Long-running computation that may be timed out.
     *  @param T Timeout
     */
    int computeRefinedAnswer(MyTimed T)
        throws MyTimed {
        int refinedAnswer = 0;
        T.enable();
        // hairy computation goes here
        T.disable();
        return refinedAnswer;
    }

    /** Public interface to the potentially timed-out computation. If a
     *  timeout occurs while computing the "refined" answer, a "rough"
     *  answer is returned instead.
     */
    public int computeAnswer() {
        int roughAnswer = 3;
        // Set up 100 usec timeout
        MyTimed T = new MyTimed((new RelativeTime(0, 100000)));
        try {
            return computeRefinedAnswer(T);
        } catch (MyTimed t) {  // Computation timed out
            return roughAnswer;
        }
    }
}

Listing Four

import javax.realtime.*;

/** Example use of RawMemoryAccess (actually, a subclass of RawMemoryAccess
 *  for accessing the Intel x86 I/O space) to directly address memory. This
 *  (elided) example provides an interface to the Intel 8253 Programmable
 *  Interval Timer, based on code originally developed by Gerald H. Hilderink.
 */
public class IOAccess extends RawMemoryAccess { /* ... */ }

public class I8253Example {
    /* ... */
    private long counter0Offset = 0;
    private long counter1Offset = 1;
    private long counter2Offset = 2;
    private long controlWordOffset = 3;  // the 8253 control register follows the three counters

    private byte controlWord0 = 0x00;
    private byte controlWord1 = 0x00;
    private byte controlWord2 = 0x00;

    IOAccess iox;

    /** Create instance of the I8253 class.
     *  @param baseAddr base address
     */
    public I8253Example(long baseAddr) {
        iox = IOAccess.create(baseAddr, (long)8);
    }

    /* ... */

    /** Write a 16-bit value to counter 2.
     *  @param value value for counter 2
     */
    public void setCounter2(short value) {
        setControlWord(controlWord2);
        iox.setByte(counter2Offset, (byte)(value & 0xFF));
        iox.setByte(counter2Offset, (byte)(value >> 8));
    }

    /** Write a byte value to the control register.
     *  @param value value for control register
     */
    public void setControlWord(byte value) {
        iox.setByte(controlWordOffset, value);
    }

    /* ... */

    /** Read a 16-bit value from counter 2.
     *  @return value of counter 2
     */
    public short getCounter2() {
        short value;
        setControlWord(COUNTER2);
        value = (short) iox.getByte(counter2Offset);
        value |= ((short)iox.getByte(counter2Offset) << 8);
        return value;
    }
}

DDJ
Many organizations are now providing remote users with online, web-based information. In corporations, this information ranges from human-resource policy manuals to data sheets. Libraries, on the other hand, offer users access to third-party online web-based electronic journals (such as the Journal of Mathematical Computation, http://www.jstor.org/journals/00255718.html) and databases (like those provided by the Online Computer Library Center; http://bart.prod.oclc.org/). Most commercial information vendors require that clients access their web servers from a valid IP address, which, at the Library of the University of Calgary, means a university IP address or campus-wide user ID/password. This is fine when users are on the campus, because on-campus machines usually have valid IP addresses. However, more and more users (distance-learning students, retired professors, staff members, and the like) are using their own off-campus ISPs to connect to the Internet. Users may also want to use a public workstation in a public library to access a service. However, when legitimate users of the university library connect directly to the Internet from an off-campus IP address, the vendor web server typically rejects the access request.
To address this problem, I designed and implemented webrelay, a freely available multithreaded HTTP relay server. (The source code and related files for webrelay are available electronically; see “Resource Center,” page 7.) In a nutshell, webrelay authenticates a client to make sure the client is a legitimate user before connecting it to the vendor web server. The vendor’s server then sees the request as coming from the relay server itself, which always has a valid IP address or campus-wide user identification.


Design Considerations 

One of my design goals for webrelay was that it be as transparent as possible to both end users and library administrators. This precluded use of conventional HTTP proxy servers. Experience shows that with conventional HTTP proxy servers, end users must configure their browsers to use that specific proxy. If a user’s ISP already has a proxy, it is difficult for the user to set up the browser to use the proxy designated by, say, the university. Furthermore, when browsing a web server other than those of specific third-party vendors, users have to turn that proxy off to avoid unnecessary user authentication imposed by the proxy. This is because, with a conventional proxy server, the library has no easy way to restrict proxying to only those vendor web servers that the library has subscribed to.

Webrelay is designed to mirror what- 
ever remote web servers you want to in- 
clude. Users do not have to configure their 
browsers in any special way, because 
users will not see the remote web server. 
To the user, the webrelay server is the real 
target server. The administrator of the li- 
brary has complete control over what ser- 
vices are included in webrelay and 
whether authentication is mandatory for 
a given web server. 

When webrelay mirrors a set of remote web servers, it maps the URL of a remote web server to a virtual directory of the webrelay with the form http://webrelay.hostname:port/DB=db_key/, where the value of db_key is an abbreviation of the real URL that a vendor advertises to patrons. The DB=db_key is a virtual directory, because there is no such physical directory on the host of the webrelay. The mapping and a corresponding mandatory-authentication flag can be defined in a configuration file by the administrator. Users are introduced to these virtual directories by hyperlinks embedded in the top homepage of the library, which is under control of the library administrators. If a user comes in from an off-campus IP address or the virtual directory has its mandatory-authentication flag set to True, the request is channeled to the User Validation Engine (UVE); see Figure 1.

Figure 1: Flowchart of the HTTP relay server.
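
The mapping from virtual directory to vendor URL, together with the routing rule just described, can be sketched as a small lookup table. This is an illustration in Java under stated assumptions, not webrelay's own code: the class, enum, and method names are hypothetical, the sample keys and vendor URL are made up, and the DIRECT outcome for on-campus clients reflects the redirect behavior the article describes for the UVE.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical sketch of the db_key lookup table and the routing decision. */
final class RelayRouting {
    enum Route { NOT_FOUND, DIRECT, VALIDATE }

    /** One configured vendor: its real URL plus the mandatory-authentication flag. */
    static final class Vendor {
        final String realUrl;
        final boolean mandatoryAuth;

        Vendor(String realUrl, boolean mandatoryAuth) {
            this.realUrl = realUrl;
            this.mandatoryAuth = mandatoryAuth;
        }
    }

    private final Map<String, Vendor> table = new HashMap<>();

    /** Adds one configuration entry (in webrelay these come from a configuration file). */
    void register(String dbKey, String realUrl, boolean mandatoryAuth) {
        table.put(dbKey, new Vendor(realUrl, mandatoryAuth));
    }

    /** Extracts db_key from a request path of the form /DB=db_key/...; null if absent. */
    static String dbKeyOf(String path) {
        final String prefix = "/DB=";
        if (path == null || !path.startsWith(prefix)) {
            return null;
        }
        int end = path.indexOf('/', prefix.length());
        String key = (end < 0) ? path.substring(prefix.length())
                               : path.substring(prefix.length(), end);
        return key.isEmpty() ? null : key;
    }

    /** Off-campus clients and flagged vendors go to validation; others go direct. */
    Route route(String path, boolean onCampus) {
        Vendor vendor = table.get(dbKeyOf(path));
        if (vendor == null) {
            return Route.NOT_FOUND;
        }
        return (onCampus && !vendor.mandatoryAuth) ? Route.DIRECT : Route.VALIDATE;
    }

    public static void main(String[] args) {
        RelayRouting routing = new RelayRouting();
        routing.register("jstor", "http://www.jstor.org/", false);
        System.out.println(routing.route("/DB=jstor/index.html", false));
    }
}
```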
Another design consideration involves how you establish a session and maintain session data in a basically stateless HTTP protocol. One option is to use Netscape cookies, but in our case this wasn’t a good mechanism since cookies are designed explicitly for user tracking. When users access those services from a public workstation in a public library, it is difficult to manage cookies for individual users, because other users may have used that workstation at different times. The webrelay server would have to manage a set of cookies for communication with the end user, as well as another set or sets of cookies that might have been issued by a remote web server.

With this in mind, I decided to take a 
very different approach. After users are 
successfully authenticated, they are as- 
signed a unique session key and regis- 
tered with the Session Control Engine 
(SCE). As the user browses through a web 
site, the SCE tracks the update time, 
records any cookies that are sent by the 
remote web server, and manages any oth- 
er pertinent session data. 

After a new session is established, I use 
the session key to replace the db_key. The 
virtual directory now consists of the ses- 
sion key and possibly a hostname of an- 
other web server that the vendor may se- 
lect. From then on every embedded URL 
in any page downloaded by users must 
be converted to have its base point to the 
webrelay hostname and port number, plus 
the virtual directory. This is done on-the- 
fly by a Relay and URL Conversion En- 
gine (RURLCE) before the page can be 
sent to the user. This ensures that subse- 
quent requests always have the correct 
session key included. Furthermore, with- 
in the session all the requests will be 
forced through webrelay. 
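
To make the URL conversion step concrete, here is a minimal sketch in Java. It is not the RURLCE itself: the layout of the virtual directory (session key followed by the vendor hostname), the class and method names, and the sample hosts are assumptions for illustration, and the sketch only handles double-quoted href and src attributes with http URLs.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical sketch of on-the-fly URL conversion for relayed pages. */
final class UrlConverter {
    // Matches href="..." or src="..." with a double-quoted value (a simplification).
    private static final Pattern LINK =
        Pattern.compile("(?i)\\b(href|src)\\s*=\\s*\"([^\"]*)\"");

    private final String relayBase;   // for example "http://relay.example:8000"
    private final String sessionKey;  // key assigned when the session was set up
    private final String vendorHost;  // host of the page currently being relayed

    UrlConverter(String relayBase, String sessionKey, String vendorHost) {
        this.relayBase = relayBase;
        this.sessionKey = sessionKey;
        this.vendorHost = vendorHost;
    }

    /** Rebases one URL so that following it leads back through the relay. */
    String convertUrl(String url) {
        String prefix = relayBase + "/" + sessionKey + "/";
        if (url.toLowerCase().startsWith("http://")) {
            return prefix + url.substring("http://".length()); // absolute: keep its host
        }
        if (url.startsWith("//")) {
            return prefix + url.substring(2);                  // scheme-relative
        }
        if (url.startsWith("/")) {
            return prefix + vendorHost + url;                  // host-relative
        }
        // Page-relative links already resolve beneath the relayed page's virtual
        // directory; other schemes (mailto:, https:) are left alone in this sketch.
        return url;
    }

    /** Rewrites every href/src attribute in a page before it is sent to the user. */
    String convertPage(String html) {
        Matcher m = LINK.matcher(html);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            String rewritten = m.group(1) + "=\"" + convertUrl(m.group(2)) + "\"";
            m.appendReplacement(out, Matcher.quoteReplacement(rewritten));
        }
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        UrlConverter c = new UrlConverter("http://relay.example:8000", "abc123", "www.vendor.example");
        System.out.println(c.convertPage("<a href=\"/search\">Search</a>"));
    }
}
```

A production converter would also need to cover unquoted attributes, form actions, and redirects; a regular expression is used here only to keep the sketch short.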

The other related aspect of the design 
is how to efficiently handle multiple con- 
nections, while at the same time avoiding 
relying on any interprocess communica- 
tion means for sharing the session data. I 
chose to use multiple threads to handle 
separate connections. Different threads 
can share the session data in the same ad- 
dress space of a single process. Compared 
with traditional multiprocess programming 
with interprocess communication means, 
threads in a multithread process facilitate 
more efficient session control, simpler cod- 
ing, and better scalability. 


User Validation Engine (UVE) 

When the first request for a given vendor’s web server is sent to webrelay, the program decodes the virtual directory to get the db_key. Based on the db_key, webrelay finds the real URL of the vendor's web server and the mandatory-authentication flag from a lookup table stored in memory that has been loaded from a configuration file at program startup. In addition to IP address checking as required by the majority of the vendors, the library sometimes requires mandatory authentication for a given vendor. Why? Because there are instances when a fee is required for a document delivery service associated with that vendor.

If the client’s IP address is correct (from our campus, in other words) and no mandatory authentication is required for the destination web server, webrelay simply redirects the client to the destination. From then on, the client does transactions directly with the vendor. This eliminates all relay traffic that involves on-campus users. Otherwise, webrelay checks to see if a session has been established. If not, the UVE sends out an authentication challenge to the client. You can choose to use either the basic or custom authentication scheme; the latter is preferred. In the case of basic authentication, the client sends out the user ID/password for all subsequent requests, which defeats the purpose of our session-control mechanism, where the SCE needs no more than a session key to keep track of all requests. With custom authentication, the UVE sends out the challenge in an HTML logon form asking the client to submit its credentials (we require a user ID/password for now). Once the UVE receives the credentials from the client, it checks them with a remote authentication server where user IDs/passwords are stored and retrieved. We use a commercial server for that purpose. Available electronically is a testing module that takes a username and password; if the username is the same as the password, the user is regarded as a legitimate user. You should customize the code to interface with whatever authentication server you choose.

In a case where multiple users share a 
public workstation, a user may use the 
browser’s Back button to go back to the 
logon form that was filled out by a pre- 
vious user who vacated the workstation. 
To prevent users from stealing other users’ 
authentication credentials for gaining ac- 
cess, the UVE sets a timestamp on the lo- 
gon page it issues. The form is invalidat- 
ed after a certain period of time, say, five 
minutes. Of course, this does not com- 
pletely solve the problem. 







Session Control Engine (SCE) 

If a client is successfully authenticated, 
webrelay registers the client with the SCE. 
The SCE assigns a unique session key to 
that session and stores the session start 
time and other pertinent information. A 
session key consists of a timestamp con- 
catenated with the hex digits of the 
client’s IP address. The session control 
information is stored in a hashtable with 
a separate-chaining linked list to resolve 
any collisions that might occur. The SCE 
uses the session key 
for lookup, update, 
retrieval, or delete op- 
erations from the 
hashtable. 
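
To make the key format concrete, here is a minimal sketch of building a session key from a timestamp and the hex digits of an IPv4 address. The exact layout (decimal timestamp followed by eight hex digits) and the function name are assumptions for illustration; the article does not spell them out.

```c
#include <stdio.h>
#include <stdint.h>
#include <time.h>

/* Build a session key as "<timestamp><8 hex digits of the IPv4 address>".
   `ip` is the client address in host byte order.
   Returns the key length, or -1 if `buf` is too small. */
int make_session_key(char *buf, size_t buflen, time_t started, uint32_t ip)
{
    int n = snprintf(buf, buflen, "%ld%08lx",
                     (long)started, (unsigned long)ip);
    return (n < 0 || (size_t)n >= buflen) ? -1 : n;
}
```

With a start time of 946684800 and the address 192.168.0.1 (0xC0A80001), this sketch produces the key 946684800c0a80001.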

Fine-grained synchronization, using mutexes from the POSIX pthread
library, protects the shared session control data in a multithreaded
environment. Each node has its own mutex, and a thread must hold that
mutex while it uses the pointer to the node. Only one thread can hold
a given node's mutex at any moment, but numerous other threads may
hold pointers to other nodes at the same time. This is certainly more
efficient than coarse-grained synchronization methods, but harder to
code (see Threads Primer: A Guide to Multithreaded Programming, by
Bill Lewis and Daniel J. Berg, Prentice Hall, 1996).
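
The idea of one mutex per node can be sketched as follows. This is an illustration only: the structure and function names are hypothetical and much simpler than the chain_node_t used in the listings, and the chain is assumed not to change while it is being searched.

```c
#include <pthread.h>
#include <stddef.h>
#include <string.h>

typedef struct node {
    struct node    *next;   /* separate-chaining link */
    const char     *key;    /* session key */
    long            hits;   /* stands in for per-session data */
    pthread_mutex_t lock;   /* protects this node only */
} node_t;

/* Find a node in one bucket chain. Assumption: nodes are not inserted
   or removed during the search (a real table must guard that too). */
node_t *chain_find(node_t *head, const char *key)
{
    for (node_t *n = head; n != NULL; n = n->next)
        if (strcmp(n->key, key) == 0)
            return n;
    return NULL;
}

/* Update one session while holding only that node's mutex, so threads
   working on other sessions are not blocked. */
int session_touch(node_t *head, const char *key)
{
    node_t *n = chain_find(head, key);
    if (n == NULL)
        return -1;
    pthread_mutex_lock(&n->lock);
    n->hits++;
    pthread_mutex_unlock(&n->lock);
    return 0;
}
```

A coarse-grained design would instead take a single table-wide mutex in session_touch, which is simpler but serializes every session update.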

Cookie handling is an important aspect 
of the SCE. Webrelay has to take over 
cookie management for the client, because 
the cookie issued by a vendor's web serv- 
er is meant for webrelay, which is seen as 
a client by the vendor's web server. If webrelay were to pass that
cookie directly to its client, the client would associate the cookie
with webrelay rather than with the vendor’s web server. When sending a
subsequent request, the client would send only the cookies that are
associated with webrelay. The vendor's web server would not recognize
the cookie as correct and would refuse the connection. Listing One
shows how the SCE
stores a cookie into the session control 
data, while Listing Two shows how it 
fetches the corresponding cookie on be- 
half of the client to be sent back to the 
vendor’s web server. 

The other important aspect of the SCE 
is the control of idle sessions. This is han- 
dled by a garbage sweeper behaving like 
a daemon thread. It wakes up every 300 
seconds to scan the entire hashtable to 
check when a client last accessed the 





vendor’s web server. If the last time the 
client downloaded a page or a file was more 
than, say, 15 minutes ago, the session is 
considered as being idle too long, and is a 
candidate to be removed from SCE. One 
catch, though, is that before the idle ses- 
sion can be removed from memory, the 
SCE must make sure that there is no other 
thread that is reading from or writing to that 
node in the hashtable. This is taken care of 
by a reference count. The reference count is initialized to zero.
Whenever a thread starts (stops) reading from or writing to that node,
the node's reference count is incremented (decremented) by one. Only
when the reference count reaches zero can the garbage sweeper remove
that node from the SCE.
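
The sweeper's decision can be summarized in a few lines. This is a sketch under assumptions: the structure and function names are hypothetical, and in real code the count would be updated while holding the node's mutex.

```c
#include <time.h>

#define IDLE_LIMIT 900   /* seconds; the "15 minutes" from the text */

typedef struct session {
    time_t last_access;  /* last time the client fetched a page or file */
    int    refcount;     /* threads currently reading or writing the node */
} session_t;

/* Called when a thread starts and stops using the node. Assumption: the
   caller holds the node's mutex, so plain increments are safe. */
void session_acquire(session_t *s) { s->refcount++; }
void session_release(session_t *s) { s->refcount--; }

/* Returns 1 when the garbage sweeper may remove the session at `now`:
   it has been idle too long and no thread is still using it. */
int session_can_be_swept(const session_t *s, time_t now)
{
    return difftime(now, s->last_access) > IDLE_LIMIT && s->refcount == 0;
}
```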

An idle session not only consumes computer resources, but is also a
real security nuisance on a public workstation, such as one in a
public library, that is shared by many users. If the SCE does not
purge the idle session, other users might be able to use the session
left over by a previous user without being subject to any
authentication. The garbage sweeper helps to alleviate this problem.


Relay and URL Conversion Engine (RURLCE)

As Figure 2 illustrates, the RURLCE consists 
of a REQuest Header Analyzer (REQHA), 
RESponse Header Analyzer (RESHA), and 
Response Entity-Body Converter (REBC). 
The REBC can convert not only static HTML pages, but also dynamic
pages generated by JavaScript.


• REQuest Header Analyzer (REQHA).
The REQHA analyzes the request header. It fetches the virtual
directory from the first header line. If the virtual directory
contains the string “DB=db_key”, it asks the UVE to start the
authentication process. Once a new session is started, the REQHA gets
from the lookup table the real URL of the web server that the vendor
has advertised in its contract. With the real URL, the REQHA
constructs a new first request header line using the path of the real
URL, and a new “Host:” header line with the real hostname and port
number of the vendor’s web server.

When the virtual directory does not 
start with a string of “DB=”, it then must 
contain a session key, or a session key 
followed by a hostname and port num- 
ber. The requested URL would look like 


http://webrelay.host.name:port/ses_key/targetpath


or


http://webrelay.host.name:port/ses_key=another.host.name:targetport/targetpath


If the virtual directory contains only
ses_key, that means the targeted machine 
remains the same as the web server ad- 
vertised in the contract by the vendor. The 
REQHA sends the session key to the SCE, 
which does all the session control check- 
ing, and also decodes the requested URL 
to get the target path on the vendor’s web 
server with the virtual directory removed. 
The REQHA then uses the session key to obtain the db_key from the
session control data, from which it can find the real hostname and
port number of the vendor’s web server. If the session key in the
virtual directory is followed by another hostname and port number,
the vendor has delegated another web server to handle the request. In
this case, the vendor’s original web server listed in the contract is
no longer relevant. The SCE still uses the session key to check the
validity of the session, while the REQHA uses the designated hostname
and port number in the virtual directory to construct the first
request header line and the “Host:” header line.
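
The three forms of virtual directory described above can be told apart with a small parser. The sketch below is an assumption about how this could be done, not the REQHA source; the vdir_t type, field sizes, and function name are hypothetical.

```c
#include <string.h>

typedef struct {
    char key[64];      /* session key, or db_key when is_new is set */
    char host[128];    /* delegated "host:port", empty if none */
    const char *path;  /* target path (begins with '/') */
    int  is_new;       /* 1 when the directory starts with "DB=" */
} vdir_t;

/* Split "/<virtual dir>/<target path>" into its parts.
   Returns 0 on success, -1 on malformed input. */
int parse_virtual_dir(const char *reqpath, vdir_t *out)
{
    const char *start, *slash, *eq;
    size_t dirlen, keylen;

    if (reqpath == NULL || reqpath[0] != '/')
        return -1;
    start = reqpath + 1;
    slash = strchr(start, '/');
    dirlen = slash ? (size_t)(slash - start) : strlen(start);
    if (dirlen == 0)
        return -1;

    out->is_new = (strncmp(start, "DB=", 3) == 0);
    if (out->is_new) {              /* skip the "DB=" prefix */
        start += 3;
        dirlen -= 3;
    }
    /* "ses_key=host:port" marks a delegated server. */
    eq = out->is_new ? NULL : memchr(start, '=', dirlen);
    keylen = eq ? (size_t)(eq - start) : dirlen;
    if (keylen == 0 || keylen >= sizeof out->key)
        return -1;
    memcpy(out->key, start, keylen);
    out->key[keylen] = '\0';

    out->host[0] = '\0';
    if (eq != NULL) {
        size_t hostlen = dirlen - keylen - 1;
        if (hostlen == 0 || hostlen >= sizeof out->host)
            return -1;
        memcpy(out->host, eq + 1, hostlen);
        out->host[hostlen] = '\0';
    }
    out->path = slash ? slash : "/";
    return 0;
}
```

When host comes back empty, the caller would fall back to the server found through the db_key stored in the session control data.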

The REQHA also removes any “Cook- 
ie:” request header line, because the cook- 
ie fetched by the client is not necessarily 
associated with the vendor’s web server, 
but rather with webrelay. The REQHA will 
always ask the SCE to see if there is a rel- 
evant cookie stored in the session con- 
trol data that was issued by the vendor's 
web server. If there is one, the SCE will 
retrieve the appropriate cookies based on 
matching domains and paths (Listing 
Two). The fetched cookies will be used 
by the REQHA to construct a new “Cook- 
ie:” header line. 
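
Matching on domains and paths can be pictured with two small helpers. They are a simplified sketch, not a copy of Listing Two: the names are hypothetical, the domain test is a plain case-insensitive suffix comparison, and the path test is a case-sensitive prefix comparison.

```c
#include <ctype.h>
#include <string.h>

/* 1 if `host` ends with `cookie_domain`, ignoring case. */
int domain_matches(const char *cookie_domain, const char *host)
{
    size_t dlen = strlen(cookie_domain), hlen = strlen(host);
    size_t i;

    if (dlen > hlen)
        return 0;
    host += hlen - dlen;            /* compare only the tail of the host */
    for (i = 0; i < dlen; i++)
        if (tolower((unsigned char)host[i]) !=
            tolower((unsigned char)cookie_domain[i]))
            return 0;
    return 1;
}

/* 1 if `cookie_path` is a leading part of `target_path`. */
int path_matches(const char *cookie_path, const char *target_path)
{
    return strncmp(target_path, cookie_path, strlen(cookie_path)) == 0;
}
```

A stored cookie would be sent to the vendor's server only when both tests succeed for the target host and path of the current request.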

If webrelay is started with the Basic Authentication scheme, the
REQHA will fetch the authentication data from the header and send it
to the UVE for user validation.

Figure 2: Flowchart of the Relay and URL conversion engine.


• RESponse Header Analyzer (RESHA).
The RESHA analyzes the response
headers returned by the vendor’s web 
server. It extracts any cookie in the 
“Set-Cookie:” header line issued by the 
remote web server and calls the SCE 
to store that cookie (Listing One). If 
there is a “Location:” header line, the 
RESHA extracts the redirect URL from 
that header line and calls the automatic 
redirection module to do a redirection 
right away. That automatic redirection 
module also asks the SCE to take care 
of the cookies before sending out the 
redirection request. The RESHA ex- 
tracts the “Content-Type:” header line 
to get the content type for later use by 
the RURLCE engine. It also extracts the 
content length as stated in the “Content- 
Length:” header line. The content 
length will be used by the RURLCE en- 
gine to facilitate reading the entity- 
body from the remote web server. The 
content length will usually need to be 
updated after the conversion of the 
entity body before sending back to the 
client. 
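
Extracting the content length from a header line could be done along these lines. This is only a sketch with a hypothetical function name; it ignores overflow and handles a single, already isolated header line.

```c
#include <ctype.h>
#include <stdlib.h>

/* Return the value of a "Content-Length:" header line, or -1 if the line
   is some other header or the value is not a plain decimal number. */
long parse_content_length(const char *line)
{
    static const char name[] = "content-length:";
    size_t i;
    char *end;
    long value;

    /* Header names are case-insensitive. */
    for (i = 0; i < sizeof name - 1; i++)
        if (tolower((unsigned char)line[i]) != name[i])
            return -1;
    line += i;
    while (*line == ' ' || *line == '\t')
        line++;
    if (!isdigit((unsigned char)*line))
        return -1;
    value = strtol(line, &end, 10);
    /* Allow trailing whitespace and the CRLF line terminator only. */
    while (*end == ' ' || *end == '\t' || *end == '\r' || *end == '\n')
        end++;
    return (*end == '\0') ? value : -1;
}
```

Because URL rewriting usually changes the size of the entity body, the value read here is useful for reading the response but has to be recomputed before the converted page is sent on.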

• Response Entity-Body Converter (REBC).
The REBC is the most complex component in this project. Every
original URL in a page fetched by webrelay must be rewritten to map to
the virtual directory of the host where webrelay is running. This is
not that difficult for a static page. However, more and more vendors
have started using JavaScript to produce dynamic pages, and it isn’t
easy to make sure that a dynamic page, generated by JavaScript or by
whatever other means, correctly maps to the virtual directory. With
JavaScript, you are dealing with a full-fledged programming language.
Furthermore, decisions on how to rewrite have to be based on not only
a lexical but also a contextual analysis. Nevertheless, the REBC I
have developed in this project does a fairly good job of supporting
the third-party services the University of Calgary has subscribed to.


The REBC basically consists of a converter for static pages and a set
of functions that deal with dynamic pages, which contain mainly
JavaScript.


The converter for a static page scans the page for HTML tags and
their attributes that may have a URL as a value. We distinguish three
different situations: a relative path of a relative URL (without a
leading slash), an absolute path of a relative URL (with a leading
slash), and an absolute URL. First,
the REBC either inserts a base URL or 
modifies the existing base URL in the 
HEAD section of the page to ensure that 
the new base URL points to the virtual di- 
rectory of webrelay. This base URL almost 
eliminates the need to rewrite a relative 
path in a URL, because that relative path 
will be relative to the directory part of the 
base URL. However, it has to rewrite an 
absolute path of a relative URL, because 
the virtual directory in the newly con- 
structed base URL interferes with the stan- 
dard algorithm for figuring out the correct 
absolute URL from a relative URL. If a 
proper rewriting is not done, the standard 
algorithm would result in an absolute URL 
where the virtual directory would be left 
out. Consequently, when the client clicks 
on that hyperlink, the session key con- 
tained in the virtual directory would be 
lost. For example, suppose the original 
relative URL is in the form of 


/dir1/dir2/file.html 


while the inserted base URL is: 


http://webrelay.host.name:port/ses_key/targetdir/targetfile


The resulting URL based on the standard algorithm will become:


http://webrelay.host.name:port/dir1/dir2/file.html


and the ses_key is lost. Therefore, the REBC must rewrite this URL as:


http://webrelay.host.name:port/ses_key/dir1/dir2/file.html


The REBC also has to rewrite any absolute URL to change the hostname
and port number and to insert the virtual directory in front of the
path. For example, suppose the original absolute URL is:


http://another.host.name:targetport/dir1/dir2/file.html


The resulting URL after rewriting should look like:


http://webrelay.host.name:port/ses_key=another.host.name:targetport/dir1/dir2/file.html
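
The three situations distinguished for a static page can be condensed into one function. The following is a minimal sketch, assuming a hypothetical rewrite_url helper that receives webrelay's own "host:port" and the session key; it recognizes only http:// URLs and is not the REBC implementation.

```c
#include <stdio.h>
#include <string.h>

/* Rewrite one URL found in a page. `relay` is webrelay's "host:port".
   Returns the length written, or -1 if `out` is too small. */
int rewrite_url(char *out, size_t outlen, const char *relay,
                const char *ses_key, const char *url)
{
    int n;

    if (strncmp(url, "http://", 7) == 0) {
        /* Absolute URL: the original host and port move into the
           virtual directory as "ses_key=host:port". */
        const char *hostport = url + 7;
        const char *path = strchr(hostport, '/');
        int hostlen = path ? (int)(path - hostport) : (int)strlen(hostport);
        n = snprintf(out, outlen, "http://%s/%s=%.*s%s",
                     relay, ses_key, hostlen, hostport, path ? path : "/");
    } else if (url[0] == '/') {
        /* Absolute path of a relative URL: put the virtual directory
           in front so the session key is not lost. */
        n = snprintf(out, outlen, "http://%s/%s%s", relay, ses_key, url);
    } else {
        /* Relative path: left alone; the rewritten base URL resolves it. */
        n = snprintf(out, outlen, "%s", url);
    }
    return (n < 0 || (size_t)n >= outlen) ? -1 : n;
}
```

The hard part in practice is not this mapping but finding every place in a page where a URL can appear, which is what the rest of the REBC discussion is about.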


When the converter for a static page parses the page, it also collects
other information for later use by the converter for a dynamic page.
For instance, it scans every invocation of a JavaScript function to
obtain the function name as well as any argument that passes a URL.
This information is passed to the converter for a dynamic page.
Whether this argument value should be rewritten depends on how the
argument relates to other elements in an assignment statement inside
the definition of the corresponding JavaScript function, which the
dynamic page converter analyzes.

The dynamic page converter deals with JavaScript function arguments,
user-defined variables, navigator objects, forms, and event handlers.
A balance must be struck by working on only the necessary items,
instead of using a full-scale language parser. For example, you may be
interested only in the location and window object values to which a
URL could be assigned, and leave other navigator objects untouched.
If a location.href object is assigned a value that is taken from an
Option list of the Form->Select element, then the URLs of the Options
of the corresponding Form selected by a client must be rewritten. If,
however, a URL in an Option list of a Form is going to be used by a
CGI script defined in the action attribute of that Form, then one
should not rewrite that URL at all, because the CGI script will be run
on the vendor’s server machine, rather than on the client.

Inside a definition of a Javascript func- 
tion, if an argument is used as the first term 
of an assignment statement, and that argu- 
ment is passed a URL value, the dynamic 
page converter informs the static page con- 
verter to rewrite that URL. When a proper- 
ty of a location object appears on the right 
side of an assignment statement, the dy- 
namic page converter does a careful anal- 
ysis of relationships of various terms with 
the location object and decides how to 
rewrite the assignment as a whole. The tricky thing here is that the
location object refers to the “real” location object in the original
page dispatched by the vendor’s web server, and its value must be
rewritten to point to webrelay with the virtual directory inserted in
front of the target path.

In the case of an assignment statement for a user-defined JavaScript
variable, the insertion of a rewritten base URL in the HEAD section of
a page helps resolve the ambiguity between a string literal and a
filename. The REBC does not have to explicitly rewrite the relative
path (a filename alone constitutes a relative URL), because the
inserted, rewritten base URL takes care of it.

The rewriting done by the REBC on- 
the-fly makes sure that the converted page 
presented to the client contains all hy- 
perlinks that point to webrelay and have 
the right session key included. This en- 
sures that subsequent requests sent by the 
client be forced to go through webrelay, 
and the session key can be used by web- 
relay to track the session. 


Conclusion 

Webrelay works efficiently to handle thousands of hits per day and is
scalable, supporting as many remote vendor web servers as you want. It
is easy for a nontechnical person to configure. All you have to do is
add or delete web servers in the configuration file, or decide whether
you want mandatory authentication for any given web server. Its
session control data is stored in memory in the address space of a
single process, so that multiple threads can access the data
efficiently. The session control module permits legitimate university
users to use the services that the university subscribes to at any
time from any ISP. They are asked for authentication only once, at the
start of access to a given web server. In subsequent transactions,
there is no need for the client to send authentication credentials in
the case of custom authentication. The session control engine checks
the session dutifully. Both static and dynamic page conversion are
supported, which makes the mechanism successful.
Listing One

/* Update cookie in the session control data */
int sess_manager_update_cookie(char *seskey, unsigned int keylen,
                               accept_info *aip, relay_info *rip)
{
  chain_node_t *cnp;
  int status;

  cnp = sess_manager_find(seskey, keylen);
  if (cnp != NULL) {
    sess_info_t *sip;

    Spthread_mutex_lock(&cnp->lock);
    sip = (sess_info_t *) cnp->data;
    if (sip != NULL && sip->ClientIPAddr) {
      if ((aip->cliaddr->sin_addr.s_addr == sip->ClientIPAddr)) {
        time_t ct;
        time(&ct);
        if (difftime(ct, cnp->LastUpdated) <= sess_manager_refresh) {
          char *cookie_cookie = NULL;
          char *cookie_name = NULL;
          char *cookie_path = NULL;
          char *cookie_domain = NULL;
          char *p;
          int i, j, len1, len2;
          /* Session still valid. */
          cnp->LastUpdated = ct;

          /* rip->cookie: NAME=VALUE; PATH=/path1/path2, while
             cookie_cookie contains NAME=VALUE, and cookie_path
             is /path1/path2 */
          cookie_cookie =
            parse_cookie(rip->cookie, &cookie_path, &cookie_domain);
          if ((p = strchr(cookie_cookie, '=')) != NULL)
            /* cookie_name is NAME */
            cookie_name = strdupdelim(cookie_cookie, p);

          if (sip->cookie_path[0] == NULL && sip->cookie_name[0] == NULL &&
              sip->cookie_value[0] == NULL) {
            /* There is no existing cookies in the SIP yet. Simply insert
               the new cookie into it. */
            sip->cookie_path[0] = xstrdup(cookie_path);
            sip->cookie_name[0] = xstrdup(cookie_name);
            sip->cookie_domain[0] = xstrdup(cookie_domain);
            sip->cookie_value[0] = xstrdup(cookie_cookie);
          } else {
            /* Match the existing cookies already stored in SIP */
            for (i = 0; i < MAX_NUM_COOKIE && sip->cookie_path[i] != NULL; ++i) {
              len1 = strlen(cookie_path);
              if (!strncasecmp(cookie_path, sip->cookie_path[i], len1)) {
                for (j = i; j < MAX_NUM_COOKIE && sip->cookie_name[j] != NULL; ++j) {
                  len2 = strlen(cookie_name);
                  if (!strncasecmp(cookie_name, sip->cookie_name[j], len2) &&
                      !strncasecmp(cookie_path, sip->cookie_path[j], len1)) {
                    /* Overwrite this cookie */
                    FREE_MAYBE(sip->cookie_value[j]);
                    /* We store NAME=VALUE together as one single cookie */
                    sip->cookie_value[j] = xstrdup(cookie_cookie);
                    break;
                  }
                }
                /* No match of cookie_name, regarded as a new cookie of
                   the same path. Now we ADD this new cookie at j */
                sip->cookie_path[j] = xstrdup(cookie_path);
                sip->cookie_name[j] = xstrdup(cookie_name);
                sip->cookie_domain[j] = xstrdup(cookie_domain);
                sip->cookie_value[j] = xstrdup(cookie_cookie);
                break;
              }
            }
            if (sip->cookie_path[i] == NULL && sip->cookie_name[i] == NULL) {
              /* No match either of cookie_name nor cookie_path.
                 This is a new cookie of
                 a new path. Now we add this new cookie at i */
              sip->cookie_path[i] = xstrdup(cookie_path);
              sip->cookie_name[i] = xstrdup(cookie_name);
              sip->cookie_domain[i] = xstrdup(cookie_domain);
              sip->cookie_value[i] = xstrdup(cookie_cookie);
            }
          }
          FREE_MAYBE(cookie_name);
          FREE_MAYBE(cookie_path);
          FREE_MAYBE(cookie_domain);
          FREE_MAYBE(cookie_cookie);

          cnp->data = (void *) sip;
          status = SES_OK;
        } else
          status = SES_TIMEOUT;
      } else
        status = SES_CLIENT_ENDS;
    } else
      status = SES_CLIENT_ENDS;
    Spthread_mutex_unlock(&cnp->lock);
    chain_hash_release(cnp);
  } else
    status = SES_CLIENT_ENDS;
  return status;
}


Listing Two

/* Retrieve cookie from session control data */
int sess_manager_retrieve_cookie(char *seskey, unsigned int keylen,
                                 accept_info *aip, relay_info *rip)
{
  int i, len1, len2, len;
  chain_node_t *cnp;
  int status;

  cnp = sess_manager_find(seskey, keylen);
  if (cnp != NULL) {
    sess_info_t *sip;

    Spthread_mutex_lock(&cnp->lock);
    sip = (sess_info_t *) cnp->data;
    if (sip != NULL && sip->ClientIPAddr) {
      if ((aip->cliaddr->sin_addr.s_addr == sip->ClientIPAddr)) {
        time_t ct;
        time(&ct);
        if (difftime(ct, cnp->LastUpdated) <= sess_manager_refresh) {
          char *targethost = NULL;
          char *targetpath = NULL;
          int i, len, len1, len2, old_len, num_entries;
          int ck_dom_len, targethost_len;

          /* Session still valid. */
          if (rip->redir_targethost != NULL)
            targethost = xstrdup(rip->redir_targethost);
          else
            targethost = xstrdup(rip->targethost);
          if (rip->redir_targetpath != NULL)
            targetpath = xstrdup(rip->redir_targetpath);
          else
            targetpath = xstrdup(rip->targetpath);
          FREE_MAYBE(rip->cookie);
          old_len = 0;
          num_entries = 0;
          for (i = 0; i < MAX_NUM_COOKIE && sip->cookie_path[i] != NULL; ++i) {
            /* First match the domain */
            targethost_len = strlen(targethost);
            if (sip->cookie_domain[i] != NULL)
              ck_dom_len = strlen(sip->cookie_domain[i]);
            else
              goto Match_path;

            /* Consume chars one by one from the end of the cookie_domain */
            while (--ck_dom_len >= 0 && --targethost_len >= 0) {
              if (sip->cookie_domain[i][ck_dom_len] !=
                  targethost[targethost_len])
                break;
            }
            if (ck_dom_len > 0) {
              /* No match of domain, search the next entry */
              continue;
            }
            /* Match the path */
          Match_path:
            len1 = (strlen(sip->cookie_path[i]) < strlen(targetpath)) ?
              strlen(sip->cookie_path[i]) : strlen(targetpath);
            if (!strncasecmp(targetpath, sip->cookie_path[i], len1)) {
              num_entries++;
              len2 = strlen(sip->cookie_value[i]);
              if (num_entries == 1) {
                len = len2;
                rip->cookie = Smalloc(len);
                memcpy(rip->cookie, sip->cookie_value[i], len);
              } else {
                len = old_len + 1 + 1 + len2;
                rip->cookie = Srealloc(rip->cookie, len);
                memcpy(rip->cookie + old_len, "; ", 2);
                memcpy(rip->cookie + old_len + 2, sip->cookie_value[i], len2);
              }
              old_len = len;
            }
          }
          if (num_entries > 0) {
            rip->cookie = Srealloc(rip->cookie, len + 1);
            rip->cookie[len] = '\0';
          }
          FREE_MAYBE(targethost);
          FREE_MAYBE(targetpath);
          status = SES_OK;
        } else {
          status = SES_TIMEOUT;
        }
      } else
        status = SES_CLIENT_ENDS;
    } else
      status = SES_CLIENT_ENDS;
    Spthread_mutex_unlock(&cnp->lock);
    chain_hash_release(cnp);
  } else
    status = SES_CLIENT_ENDS;
  return status;
}

Visio is generally considered a generic illustration tool for
creating flowcharts, organizational charts, timelines, and the like.
While Visio (from Visio Corp., http://www.visio.com/, recently
acquired by Microsoft) is an excellent tool for applications such as
these, it is also built around a powerful visualization engine, making
it ideal for visualizing and diagramming networks, software,
databases, and other such systems.

With Visio, you can use drag-and-drop 
to assemble diagrams from a multitude of 
prebuilt shapes. Each shape consists of a 
number of properties that control a shape’s 
appearance and response to stimuli. Visio 
presents the editable properties in a 
ShapeSheet editor that looks somewhat 
like a normal spreadsheet. In addition to 
simple numeric values, a property (or cell) 
can contain a formula that derives the val- 
ue through some computation. 

These shapes can be transformed, con- 
nected to other shapes, and grouped. Fur- 
thermore, you can construct custom shapes 
from scratch or by modifying/extending 
those in the box. You can then use sten- 
cils to collect shapes for a particular pur- 
pose. In this article, for example, I'll use 
a network stencil that contains many 
shapes to represent the objects typically 
found on computer networks. In addition 
to stencils, Visio supports the notion of 
templates that are empty (or partially built) 
diagrams to which one or more stencils 
can be attached. In short, a Visio template 
can be used in much the same way as a 
Microsoft Word or Excel template. 


Chris is the cofounder of Wave Software
(http://www.wavesoftware.com/), which
specializes in software development for
Windows and the Internet. He can be
reached at ctrueman@wavesoftware.com.
Visualizing Network Resources Using Visio

Using drag-and-drop to assemble development diagrams

Chris Trueman


To illustrate the use of Visio in a soft- 
ware development environment, I'll build 
the presentation layer for network resource 
discovery and visualization. While this ar- 
ticle is based on Visio Professional 5.0, 
there are several different versions of the 
package, ranging from Visio Standard Edi- 
tion to Visio Enterprise 5.0. In addition, 
there are add-ons, such as the Visio So- 
lutions Library and the Visio Network 
Equipment toolkit (with various network 
device shapes). 

There are three different ways to build 
a Visio-based solution: 


• Write a special kind of DLL called a “Visio Solutions Library” (VSL).

• Utilize the embedded Visual Basic for Applications (VBA)
development environment.

• Create a separate executable that drives Visio through its
automation API.


I'll focus on VSLs using C++. That doesn’t
mean I don’t use VBA. In fact, I always 
start a Visio-based project by prototyping 
the automation usage in VBA because it 
is quick and easy. The embedded nature 
of VBA means that solutions developed 
require a template (or diagram) in which 
to store the code. Likewise, implementing 
a separate executable that drives Visio is 
similar to writing a VSL, except that Visio 
runs in a different process. Thus the com- 
munication between the two applications 
is slightly slower. I’ve written solutions of 
this form using both C++ (using the wrap- 
per classes) and Visual Basic. 

Except for VBA, your choice of devel- 
opment tool is restricted to those that are 
capable of calling through OLE automa- 
tion interfaces. However, Visio makes it 
easier to create VSLs if you are using Mi- 
crosoft Visual C++ by including a custom 
App Wizard. All of the aforementioned alternatives utilize the Visio
Automation API to handle the interactions between your code and the
Visio environment.


Visio Solutions Library 

A VSL is a standard Windows DLL that 
contains one or more add-on objects. It 
is identified by the .VSL suffix and the sin- 
gle export: 


VAORC VisioLibMain(VAOMSG wMsg, WORD wParam, LPVOID lpParam)


I’ve removed some of the additional 
preprocessor elements that control its ex- 
port and calling convention. In form, it 
looks like a Windows message handler; 
indeed, this is how it operates in practice. 
Once the VSL has been loaded, Visio sends it messages through this
function to inform it of actions and events that have occurred. As
the writer of a VSL, you will not be interested in all of the
messages, so Visio includes a default message handler (sounds a lot
like DefWindowProc) called VAOUtil_DefVisMainProc.

It is possible for one VSL to contain mul- 
tiple add-on objects. When Visio calls 
through VisioLibMain, it sets the wParam 
parameter to be the identifier of the add- 
on that it is communicating with. When the 
add-on first registers with Visio, it is as- 
signed a session-wide unique identifier. If 
you write add-ons and VSLs from scratch, 
it is your responsibility to ensure you record 
this information and process a standard set 
of messages in your VisioLibMain. Alter- 
natively, you can use C++ and leverage the 
wrappers and Wizard that Visio provides. 

Wizards are included for Visual C++ 4 and 5 (Version 5 Wizards also
work with Version 6.0). When run, the Wizard creates a project that
resembles the standard Win32 DLL project. However, the Visio-created
project contains a prebuilt VAddon derivative you use as the basis for
your application. The project will compile and link successfully and,
if installed, the add-on will pop up a message box when invoked.

The most important VAddon override 
is the Run method, which is called when 
users invoke your add-on from the Visio 
UI (or programmatically). Example 1 
shows the Run method implementation in 
my network resource discovery VSL. The 
VAddon source file includes a stock im- 
plementation of the VisioLibMain that 
routes messages to virtual methods de- 
fined on the VAddon class. By overriding 
these methods in your derivatives you con- 
trol how your add-on reacts to messages 
sent from Visio. 

The Wizard is only capable of generat- 
ing a project with one add-on. If you want 
the VSL to contain multiple add-ons then 
they must be cranked out by hand. Study 
the VAddon class and, in particular, how it 
is registered before starting down this path. 

In addition to providing a C++ wrap- 
per for writing add-ons, Visio includes an 
entire library of classes that encapsulate 
its automation API. This library saves you 
from having to spend too much time worrying about the details of the
interfaces (in particular AddRef and Release) and lets you
concentrate on the important job of writing your solution.

None of the files I’ve mentioned (Wizard, C++ wrappers, and so on) is
installed by default. You must select the
custom install and make sure that the 
“Developing Visio Solutions” option is 
checked. You can tell if your installation 
includes these files by looking for a DVS 
subdirectory in the Visio directory. 


Table 1: Layout settings for Figure 1.

You configure the Visual C++ project
to debug a VSL in exactly the same way 
you would for any normal Win32 DLL. 
Under the “Debug” option for the pro- 
ject set the “Executable for debug ses- 
sion” to <VISIO>\VISIO32.EXE where 
<VISIO> has been substituted for the Vi- 
sio installation path on your system. 

To install a VSL, place it in a directory that Visio searches. Visio
maintains a list of directories that it searches on startup for
templates, stencils, and VSLs. By default, every installation includes
a Solutions subdirectory. (In the past, I’ve created a subdirectory
under Solutions called “IntelliCorp,” the company I work for, for
storing our VSLs.)

If you are running Visio in Developer 
Mode, then each VSL that has been 
marked visible can be found under 
Tools>>Addons. If the Addons menu is 
not present, look under Macros or enable 
Developer Mode from the Advanced Op- 
tions dialog. If you’re building solutions, 
you really should be running in Devel- 
oper Mode, as this enables a number of 
useful short-cut menus— most notably the 
Show ShapeSheet item on the Shape Con- 
text menu. 

You are not constrained to install your 
solution in the Solutions directory struc- 
ture. However, if an alternative directory 
is used, you must ensure that the search 
path is updated. You can do this either through the Visio UI or
programmatically by editing the Visio.ini file. Changing the path
through the UI causes Visio to automatically update its cache of VSLs,
stencils, and templates. If you change it programmatically (for
example, from an installation script you have written for your
solution), remember to change the BuildDirectoryCache entry in the
INI file to 0, as this instructs Visio to update its cache when it is
next started.


Pulling the Resources Together 

To illustrate the ideas presented here, I’ve 
written a C++ VSL that generates diagrams 
to represent all the available resources on 
a Windows network. The VSL utilizes the 
Visual C++ 5.0 Wizard, C++ wrappers, a 
standard Visio template, and some Win32 
API calls. 

Figure 1 diagrams my home network. 
The real meat of the diagram starts with 
the cloud (ICBRISTOL); this is my Work- 
group. The two machines it contains 
(DEEPTHOUGHT and DIABLO) are 
shown with their shares. Directory and 
printer shares each get their own shape. 

In the interest of clarity, the code con- 
tains only a few debug ASSERTs and al- 
most nothing in terms of error handling. 
If you start copying/pasting code into 
your solution, your first exercise would 
be to introduce some error handling. 
Also, I divided the VSL engine into a CNetworkResourceCollector and
a CNetworkResource class.


CNetworkResourceCollector 

Listings One and Two present the CNetworkResourceCollector class.
The point of
entry to this object is its Run method. The 
actual processing is then split into the col- 
lection of resource data and its presenta- 
tion. The Collect method is surprisingly 
short and uses recursion to walk the hi- 
erarchy of resources found in a network. 
In general, Windows networks are formed 
as follows: 


Windows Network 
Domain 
Computer 
Share (represents both directories and 
printers) 


The core Win32 API functions used to 
traverse this structure are WNetOpenEnum, 
WNetCloseEnum, and WNethnumResource. 
Before the advent of the COM task allo- 
cator, it was typical for functions that re- 
turned variable length data to require two 
calls. The first call determined how much 
memory would be required to hold the 
returned data. The second call actually 
obtained the data. The WwNet functions 
work in this way. WNetEnumResource re- 
turns ERROR_MORE_DATA when it needs 
more memory to complete successfully. 
A common bug (at least it used to be) is
illustrated in the realloc call in this method;
see the code comments for more details.
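The realloc hazard mentioned above can be avoided by holding the result in a temporary pointer. The following is a minimal sketch, not the author's code: GrowBuffer is a hypothetical helper name, it is written in portable C++ rather than against the Win32 headers, and it assumes newSize is greater than zero.

```cpp
#include <cassert>
#include <cstdlib>
#include <cstring>

// Hypothetical helper: grows *buffer to newSize bytes (newSize > 0 assumed)
// without losing the old block if the allocation fails.
// Returns true on success; on failure *buffer is untouched and still valid.
bool GrowBuffer(void** buffer, std::size_t newSize)
{
    // realloc returns NULL on failure but leaves the original block
    // allocated, so assigning its result straight to *buffer would
    // overwrite the only pointer to that block and leak it.
    void* grown = std::realloc(*buffer, newSize);
    if (grown == NULL)
        return false;

    std::memset(grown, 0, newSize);  // mirrors the ZeroMemory call in the listing
    *buffer = grown;
    return true;
}
```

In the ERROR_MORE_DATA branch, a caller would pass the address of its buffer pointer and the size reported by the enumeration call, and stop enumerating if the helper returns false.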

Figure 1: Network resource diagram.

As each resource is found, an instance
of CNetworkResource is created to
encapsulate the important details of the
NETRESOURCE structure. I maintain the
hierarchy by passing a parent resource
through each successive call to Collect.
This also explains the Network
Universe shape at the top of the diagram.
It makes the code in Collect much sim- 
pler during its first invocation if a real 
CNetworkResource instance is supplied. 
If I didn’t include this, I'd have to program
for a special case, which makes
the code larger and more cumbersome and
can point to design deficiencies.
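The value of a real root object is easier to see in a stripped-down form. The sketch below is an illustration only: RawEntry and Node are hypothetical stand-ins for the enumeration results and for CNetworkResource, and it uses modern standard C++ rather than the Visual C++ 5.0 dialect of the listings. Because the caller supplies a root node, the recursive function never has to ask whether it has a parent.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for one entry returned by a network enumeration.
struct RawEntry {
    std::string name;
    std::vector<RawEntry> children;
};

// Hypothetical stand-in for CNetworkResource: a name plus owned children.
struct Node {
    std::string name;
    std::vector<std::unique_ptr<Node>> children;
    explicit Node(std::string n) : name(std::move(n)) {}
};

// Every call receives a real parent, so there is no "first call" branch.
void Collect(Node& parent, const std::vector<RawEntry>& entries)
{
    for (const RawEntry& entry : entries) {
        parent.children.push_back(std::make_unique<Node>(entry.name));
        Collect(*parent.children.back(), entry.children);
    }
}

// Counts the nodes below (not including) the given node.
std::size_t CountDescendants(const Node& node)
{
    std::size_t total = node.children.size();
    for (const auto& child : node.children)
        total += CountDescendants(*child);
    return total;
}
```

Without the root, Collect would need either a nullable parent and a branch to handle it, or a separate top-level loop that duplicates the body of the recursion.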

Once all the resources have been
collected, the Display method is called. It
is responsible for loading the Visio
network template, calling on each of the
resources to draw itself, and finally
organizing the shapes using Visio’s built-in
auto layout routine.


CNetworkResource 

In addition to wrapping the significant 
elements of the NETRESOURCE struc- 
ture, the CNetworkResource (available 
electronically; see “Resource Center,” 
page 7) provides methods to create the 
Visio representation of the resource and 
to add new resources to it (this is how 
the hierarchy is maintained in the data 
structure). Each type of resource gets its 
own shape. 


Generating the Diagram 

I used the Basic Network template that 
ships with Visio 5.0 Professional as the 
basis for the network resource diagram. 
The stencils that it uses provide a good 
range of shapes that can be used to rep- 
resent the resources of the network. In 
particular, I used the Desktop PC (computer),
Straight Bus (network), Cloud (domain),
City (root, the network universe),
Printer (shared printer), and Tape Drive 
(shared directory). The Tape Drive shape 
is actually used for directory shares be- 
cause there is no hard-disk shape in the 
stencil; odd, given that there is one for 
floppy drives. 

To make it easier to change the shapes 
used to represent each resource, I created
a resource-to-shape map (see
ShapesInc.cpp; available electronically). 
The g_Shapes global vector holds the 
shape index used for each type of re- 
source. For the most part, determining 
the type of resource is a simple matter 
of looking at the dwDisplayType attribute 
of the NETRESOURCE structure (or call- 
ing GetDisplayType on a CNetworkRe- 
source instance). Unfortunately, in the 
case of shares this always returns RE- 
SOURCEDISPLAYTYPE_SHARE. For this 
reason, I store the dwType element (Get- 
Type) and use it to distinguish directory 
and printer shares. 
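The two-level decision described above (display type first, then the plain type for shares) might look like the following. This is a sketch under assumptions, not the contents of ShapesInc.cpp: the enums are local stand-ins for the RESOURCEDISPLAYTYPE_* and RESOURCETYPE_* constants, ShapeFor is a hypothetical function, and it returns stencil master names where the article's g_Shapes vector holds shape indexes.

```cpp
#include <cassert>
#include <string>

// Local stand-ins for the Win32 constants; real code would use the
// RESOURCEDISPLAYTYPE_* and RESOURCETYPE_* values from the Windows headers.
enum class DisplayType { Network, Domain, Server, Share };
enum class ResourceType { Any, Disk, Print };

// Hypothetical lookup: returns the stencil master name for a resource.
std::string ShapeFor(DisplayType display, ResourceType type)
{
    switch (display) {
    case DisplayType::Network: return "Straight Bus";
    case DisplayType::Domain:  return "Cloud";
    case DisplayType::Server:  return "Desktop PC";
    case DisplayType::Share:
        // The display type alone cannot tell a printer from a directory,
        // so the plain resource type breaks the tie.
        return type == ResourceType::Print ? "Printer" : "Tape Drive";
    }
    return "";  // not reached for the four display types above
}
```

Keeping the mapping in one place means that swapping the Tape Drive shape for something more disk-like is a one-line change.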

In my tests, only the following resource 
types were encountered: 


RESOURCEDISPLAYTYPE_SHARE 
RESOURCEDISPLAYTYPE_DOMAIN 
RESOURCEDISPLAYTYPE_NETWORK 
RESOURCEDISPLAYTYPE_SERVER 


The production of the diagram falls 
neatly into these steps: 


1. Create an instance of a shape for each 
resource. 
2. Set the text of the shape.

3. Assemble the hierarchy of links between
the resources.

4. Lay out the diagram.


Fortunately, Visio provides a number of 
autolayout algorithms, so making the dia- 
gram look pretty is simply a matter of call- 
ing the right method. As I’ve said, Visio 
handles all the dirty work of laying out the 
shapes and links. After trying the different 
options, I settled on those in Table 1. 

Interestingly, the layout properties are 
stored in the User Defined properties sec- 
tion on the Page object. You can see this 
by opening the ShapeSheet editor for the 
page. You configure the settings by updat- 
ing the contents of each User Defined prop- 
erty cell. 

Figure 1 took about 20 seconds to pro- 
duce. Running it on the INTELLICORP do- 
main remotely via a dial-up ISDN con- 
nection took around 15 minutes. A fair 
percentage of the time is spent executing 
the autolayout function — reasonable con- 
sidering the number of machines and 
shares present in the domain. 


Conclusion 

The underlying principle of this article 
is to reinforce the idea of component- 
based development— the utilization of 
prebuilt, quality modules that can be 
rapidly assembled to create complete 
software solutions. 


DDJ
(Listings begin on page 102.) 
Listing One

// NetworkResourceCollector.h: interface for CNetworkResourceCollector class
//////////////////////////////////////////////////////////////////////

#if !defined(AFX_NETWORKRESOURCECOLLECTOR_H__C855318A_17FC_11D3_B3C5_00105A98B108__INCLUDED_)
#define AFX_NETWORKRESOURCECOLLECTOR_H__C855318A_17FC_11D3_B3C5_00105A98B108__INCLUDED_

#if _MSC_VER >= 1000
#pragma once
#endif // _MSC_VER >= 1000

#include <list>

class VNetInfo;
class CNetworkResource;

//////////////////////////////////////////////////////////////////////
// CNetworkResourceCollector
class CNetworkResourceCollector
{
public: // C'tor/d'tor.
    CNetworkResourceCollector(VNetInfo* addon);
    ~CNetworkResourceCollector();
public: // Operations.
    BOOL Run();
private: // Implementation.
    BOOL Layout(CVisioPage& page);
    void DeleteNetworkResources();
    CNetworkResource* CreateNetworkResource(NETRESOURCE* buffer);
    BOOL Display();
    BOOL Collect(CNetworkResource* parent, NETRESOURCE* nr = NULL);
    VNetInfo* m_Addon;
    // Having a "special" root object makes the Collect code simpler.
    CNetworkResource* m_RootResource;
public:
#ifdef _DEBUG
    void Dump();
#else
#define Dump()
#endif // _DEBUG
};
#endif // !defined(AFX_NETWORKRESOURCECOLLECTOR_H__C855318A_17FC_11D3_B3C5_00105A98B108__INCLUDED_)


Listing Two

// NetworkResourceCollector.cpp
//////////////////////////////////////////////////////////////////////

#include "stdafx.h"
#include "VNetInfo.h"
#include "visiwrap.h"
#include "NetInfo.h"
#include "NetworkResourceCollector.h"
#include "NetworkResource.h"
#include "ShapesInc.h"

#ifdef _DEBUG
#undef THIS_FILE
static char THIS_FILE[]=__FILE__;
#define new DEBUG_NEW
#endif

CNetworkResourceCollector::CNetworkResourceCollector(VNetInfo* addon)
{
    m_Addon = addon;
    m_RootResource = new CNetworkResource();
}
CNetworkResourceCollector::~CNetworkResourceCollector()
{
    DeleteNetworkResources();
}
BOOL CNetworkResourceCollector::Run()
{
    Collect(m_RootResource);
    Display();
    return (TRUE);
}
BOOL CNetworkResourceCollector::Collect(CNetworkResource* parent,
                                        NETRESOURCE* nr /* NULL */)
{
    HANDLE enumHandle = 0;
    DWORD rc = 0;
    rc = ::WNetOpenEnum(RESOURCE_GLOBALNET, RESOURCETYPE_ANY, 0, nr,
                        &enumHandle);
    if (NO_ERROR == rc)
    {
        NETRESOURCE* buffer = NULL;
        DWORD bufferSize = 0;
        bufferSize = sizeof(NETRESOURCE);
        buffer = (NETRESOURCE*) malloc(bufferSize);
        ASSERT(NULL != buffer);
        ::ZeroMemory(buffer, bufferSize);
        do
        {
            DWORD count = 1;
            rc = ::WNetEnumResource(enumHandle, &count, buffer,
                                    &bufferSize);
            switch(rc)
            {
            case NO_ERROR:
                {
                // buffer now describes a single network resource.
                CNetworkResource* res = CreateNetworkResource(buffer);
                ASSERT(NULL != res);
                parent->AddResource(res);
                Collect(res, buffer);
                break;
                }
            case ERROR_MORE_DATA:
                // For clarity I am not protecting against realloc returning
                // NULL - should it do so then we'll lose the memory already
                // allocated to buffer. Be warned!
                buffer = (NETRESOURCE*) realloc(buffer, bufferSize);
                ASSERT(NULL != buffer);
                ZeroMemory(buffer, bufferSize);
                break;
            default:
                break;
            }
        } while (ERROR_NO_MORE_ITEMS != rc);
        free(buffer);
        buffer = NULL;
        ::WNetCloseEnum(enumHandle);
    }
    return (TRUE);
}
BOOL CNetworkResourceCollector::Display()
{
    CVisioApplication app;
    m_Addon->GetApp(app);
    CVisioDocuments documents;
    HRESULT hr;
    hr = app.Documents(documents);
    if (SUCCEEDED(hr))
    {
        CVisioDocument document;
        BSTR filename = CString("Basic Network.vst").AllocSysString();
        hr = documents.Add(filename, document);
        FREE_BSTR(filename);
        if (SUCCEEDED(hr))
        {
            CVisioMasters masters;
            short documentsCount;
            documents.Count(&documentsCount);
            for(int i = 1; i <= documentsCount; i++)
            {
                CVisioDocument document;
                documents.Item(COleVariant((long) i), document);
                BSTR bstrName;
                CString name;
                document.Name(bstrName);
                name = bstrName;
                FREE_BSTR(bstrName);
                TRACE1("%s\n", name);
                // We're looking for this specific stencil.
                if(-1 != name.Find("basic network shapes.vss"))
                {
                    document.Masters(masters);
                    break;
                }
            }
            CVisioPages pages;
            hr = document.Pages(pages);
            if (SUCCEEDED(hr))
            {
                CVisioPage page;
                hr = pages.Item(COleVariant((long) 1), page);
                if (SUCCEEDED(hr))
                {
                    m_RootResource->Display(page, masters);
                    Layout(page);
                }
            }
        }
    }
    return (SUCCEEDED(hr));
}
CNetworkResource* CNetworkResourceCollector::
CreateNetworkResource(NETRESOURCE* buffer)
{
    ASSERT(NULL != buffer);
    CNetworkResource* nr = new CNetworkResource(buffer);
    ASSERT(NULL != nr);
    return(nr);
}
void CNetworkResourceCollector::DeleteNetworkResources()
{
    delete m_RootResource;
    m_RootResource = NULL;
}
BOOL CNetworkResourceCollector::Layout(CVisioPage& page)
{
    HRESULT hr;
    CVisioShape shape;
    hr = page.PageSheet(shape);
    ASSERT(SUCCEEDED(hr));
    CVisioCell cell;
    ShapeGetCell(shape, _T("User.visControlsAsInputs"), cell);
    CellSetFormula(cell, _T("0"));
    ShapeGetCell(shape, _T("User.visPlacementStyle"), cell);
    CellSetFormula(cell, _T("1"));
    ShapeGetCell(shape, _T("User.visPlacementDepth"), cell);
    CellSetFormula(cell, _T("1"));
    ShapeGetCell(shape, _T("User.visRoutingStyle"), cell);
    CellSetFormula(cell, _T("7"));
    ShapeGetCell(shape, _T("User.visResize"), cell);
    CellSetFormula(cell, _T("-1"));
    page.Layout();
    return (TRUE);
}

DDJ
The Palm, the Nose, and Other Computing Platforms


Michael Swaine 


In this month’s walkabout we'll observe
some of the smaller fauna of the field.
These nature hikes are always primarily
about identification, and in this one
you'll get tips on how to identify web sites 
by their distinctive aromas, as well as how 
to distinguish one cyberlinguistic species 
from another, so you will never again con- 
fuse Rebol with Java. At this time of year 
there’s even a chance that we'll catch a 
glimpse of an amateur mathematician. Let’s 
hope for the best. 


Designing for the Nose 
I never promised you a rose garden. 
Such could be the response of the com- 
puter industry when and if computer users 
rise up in outrage against the stinking-up
of the computer-using experience. When 
and if that happens. Of course DigiScents 
would prefer that you believe that its tech- 
nology will only deliver sweet and wel- 
come fragrances to your desktop, but 
don’t count on it. DigiScents (http://www 
.digiscents.com/) is a Silicon Valley com- 
pany that has figured out how to do what 
has only been the subject of jokes and 
hoaxes and fringe research in the past— 
it has learned how to scent-enable your 
computer. Or your television set or your 
game machine, for that matter. The DigiS- 
cents device, which they claim they are 
seriously planning to market under the 
name iSmell, takes some digital data as in- 
put and produces as output a wide vari- 
ety of selected smells. How wide a selec- 
tion is a particularity that will have to await 
the actual release of the product, but a 
bunch, apparently. 


Michael is editor-at-large for DDJ. He can 
be contacted at mswaine@swaine.com.
As a video game machine peripheral
iSmell could give a new dimension of re- 
ality to games, as a TV add-on it offers 
the promise of smell-o-vision (if that’s a 
Lockheed trademark I apologize in ad- 
vance), and as a computer peripheral it 
immediately suggests the possibility of 
scent-enabling e-commerce web sites. 


Just smell that cologne, that cognac, that
Corinthian leather.

Joel Lloyd Bellenson, one of the com- 
pany’s founders, took some academic re- 
search on how odoriferous molecules trig- 
ger smell receptors in the brain, then 
apparently drew the shortest path between 
that research and a marketable product, 
and he and partner Dexster Smith followed 
that path. Their goal is to synthesize all 
smells from a relatively small palette, like 
the color mixing in an inkjet printer. The 
full scent palette may need to be a hun- 
dred times larger than red-green-blue or 
cyan-magenta-yellow-black, but the idea’s 
the same. They have produced a box that 
is at least a first approximation to their 
goal: Send it the right signals and it emits 
the aroma of burnt wood, bananas, cheap 
perfume. The November 1999 issue of 
Wired magazine devoted a cover story to 
DigiScents. 

I take quite seriously the money-making 
possibilities of this DigiScents technology. 
Obviously it’s a boon to selling online prod- 
ucts for which the aroma is an important 
feature. But it could be more broadly useful
in a more subtle way: to stir emotions
subliminally in product ads and political 
messages. Smell also offers another modal- 
ity for storytelling in movies and games. 

One application that will probably be 
promoted by someone is aromatherapy, 
the alternative medical practice of prescribing
the sniffing of essential oils for
health. I have real doubts about that one. 
If aromatherapy works at all it presumably 
depends on the effects of the specific oils 
and their chemical constituents. As I un- 
derstand it, DigiScents’ technology con- 
centrates on triggering the right receptors 
in the nose, and may get its effects via quite 
different chemicals than aromatherapy tra- 
ditionally uses. The opportunities for de- 
velopers include hardware devices, smell 
cartridges, and smell-enriched content. 
DigiScents will likely try to license its tech- 
nology broadly. 

I’m also inclined to take the emotion 
and memory angles seriously. In the 
November 1999 issue of Scientific Amer- 
ican you can read about Rachel S. Herz, 
a psychologist who is almost alone in doing
serious research on the connection
between smell and memory and between
smell and emotion. The olfactory system
is unique among the senses in being directly
connected to the limbic system:
the amygdala, which is the emotional cen- 
ter in the brain, and the hippocampus, a 
memory center. The other senses are all 
mediated in their communication with the 
limbic system, but the olfactory system 
talks directly to memory and emotion; the 
limbic system actually grew out of the ol- 
factory system, in evolutionary terms. 

Taking off from this fact, Herz pro- 
pounds the radical idea that emotion is 
essentially the same as scent, that it is just 
another, more abstract, expression of the 
same information. 

On the other hand, she also says that 
her research shows that women consider 
scent the most important factor in mate 
selection, and men consider it the second 
most important factor. In a culture in 
which people take such pains to disguise 
their native scents, I find that result a lit- 
tle hard to believe, and that makes me 
wonder a little about all her work. 

But just a little. The nose definitely has 
a rather direct path to our memories and 
our emotions, and we 
have never before had 
the ability to fully ex- 
ploit scent as a com- 
munication medium. 
We don’t know what 
the results might be. 


Freedom of Stench 
It won't all be roses, 
I’m sure. 

Unlike sight or 
sound, smells linger; 
and like sound, smells 
can’t be easily con- 
tained. The developers 
have already run up 
against the perma- 
nence issue: Before 
you can appreciate the 
next smell, you some- 
how have to get rid of 
the previous one. Per- 
fume vendors have 
come up with the trick of keeping bowls 
of coffee beans around to sniff. Apparent- 
ly coffee sort of neutralizes the effect of 
perfumes. Whether that helps the users of 
the iSmell isn’t clear, since they will be experiencing
a wider range of smells than any
perfume-counter sniffer would, but at least 
it gives me a Java reference for this item. 
An effective implementation apparently re- 
quires the use of Java beans. No word 
whether that has to be 100 percent pure 
Java, or if it could be any aroma starting 
with the letter “J.” 

But now the really bad news: In a 
world of computer viruses, e-mail scams, 
and spam attacks, we have to anticipate 
the offensive use of the technology. There 
will doubtless be stinkbomb e-mails, ol- 
factorily sabotaged web sites, and smell 
viruses. Less threatening but more 
widespread will be the effect of bad taste. 
Think of the early days of desktop pub- 
lishing, the days of ransom-note design. 
Some folks don’t have good taste, and 
some folks have less sensitive noses than 
others, too. Porn sites will surely use 
scent to good advantage— or bad, de- 
pending on your perspective. I’d just as 
soon drop that line of thought, except 
that wherever pornography rears its head, 
sticky legal issues ensue. 

I wouldn’t be surprised to see digital 
scents being banned in many workplaces, 
and in public places like libraries. This 
could create some interesting legal chal- 
lenges. There is no specifically established 
right of free stench, akin to the right of
free speech, but free speech has been in- 
terpreted to mean free expression, so an 
argument for freedom to smell could be 
made. That would immediately raise oth- 
er questions: Are there smells that could 
be judged obscene, as opposed to mere- 
ly suggestive, prurient, 
obnoxious, or offen- 
sive? Are there racist 
smells? Is it racist to 
suggest that there 
might be? This whole 
thing could be a polit- 
ical hot potato. Or 
make that onion. 

Another legal issue 
that has some prece- 
dent: scents as intel- 
lectual property. Sun 
Microsystems might ac- 
tually want to protect 
a particular coffee 
smell as the official 100 
percent Java (TM) aro- 
ma, granting the right 
to use it only on prod- 
ucts or web sites using 
approved Java tech- 
nology. 


It’s No Java 

I never told you, a couple of months ago 
when I was writing about the free Rebol 
programming language, that there was a 
commercial version in the works. 

It’s out now. Carl Sassenrath and his 
merry band of Rebollers have released 
REBOL/Command 1.0, an enhanced ver- 
sion with features designed for e-com- 
merce application development, includ- 
ing support for ODBC and for calling 
third-party applications, and the abili- 
ty to call platform-specific system or 
shell commands or DLLs from within RE- 
BOL scripts. As the reference to DLLs and 
shell scripts suggests, REBOL/Command 
will initially be available for Windows 
and popular UNIX versions, including 
Linux. 

The original REBOL/Core is still avail- 
able as a free download and still being 
enhanced and maintained. REBOL/Com- 
mand is a particular package for a partic- 
ular market. More such packages can be 
expected to come later. One news report 
on the announcement got a little carried 
away, characterizing this as the language 
that might succeed where Java has failed. 
Hmph. I’m a fan of Rebol, but that’s a bit 
over the top. I won't itemize the many 
tools that Java has and Rebol lacks, since 
it could be argued that this is just a gen- 
eration gap. Rebol is still very young. But 
even if you agree that Java has failed, a 
claim that would demand clarification in 
any case, I think the failure would be Sun’s 
failure to carry through on the promise of
“write once, run anywhere.” That’s most- 
ly a failure of Sun itself to control Microsoft 
and to strike a workable balance between 
its desire to control the language and the 
concerns of standards organizations. It’s 
not a technical issue. Rebol’s approach 
to “write once, run anywhere” is for the 
Rebol staff to port the language to every 
platform they can think of. They’ve done 
a great job with REBOL/Core, but for ev- 
idence that the approach has its limits, 
see the platform requirements for that 
Rebol/Command release. The Mac isn’t 
even on the to-do list, so far as I can tell, 
and some other platforms are likely to 
take a while. 

I’m a fan of Rebol. It’s an interesting, 
powerful, easy-to-learn language that 
manages to fit a significant bundle of ca- 
pabilities into a remarkably small foot- 
print. It makes it easy to write simple In- 
ternet applications — easier, I think, than 
any other tool has managed. It may have 
a spectacular future, but it is not today a 
replacement for Java, and won’t be any 
time soon. 


Error of Prediction? 

I never claimed to be a mathematician, 
but I have represented myself as not being
an innumerate boob, so it was inevitable
that I’d get the math wrong. Anyway,
Lloyd Rice thinks I did. Lloyd wrote
to point out that the future is not asymp- 
totic. Most of the compelling evidence for 
the increasing pace of advancement in 
technology (see my recent columns on 
Stewart Brand’s The Clock of the Long Now 
and James Gleick’s Faster) is the kind of 
change regarded as generally following 
Moore’s Law. But Moore’s Law has no 
asymptote: If CPU performance doubles 
every 18 months, it just keeps on dou- 
bling. “If it's 100 MIPS this year, then it’s 
10 to the 32nd MIPS a century from now,” 
Lloyd says. “No problem.” Actually I think 
he’s assuming a 12-month doubling, but 
the idea’s right. Of course it won't hap- 
pen; Moore’s Law is already breaking 
down. Or is it? There is a growing sense 
that Moore’s Law taps into some prop- 
erty of technology as a whole, and that 
the breakthroughs will come when the 
limits of particular technologies are 
reached. This sort of naive faith in tech- 
nology seems—well, naive, but it also 
seems entirely consistent with recent ex- 
perience. 
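Lloyd's arithmetic is easy to check. The sketch below is my own illustration (Projected is a hypothetical helper, and the 100 MIPS starting point and 100-year span are the figures from his quote); it computes the performance reached after a fixed period of steady doubling.

```cpp
#include <cassert>
#include <cmath>

// Performance after `years` of doubling once every `doublingYears` years.
// The result is finite for every finite number of years: an exponential
// grows without bound but never goes vertical.
double Projected(double startMips, double years, double doublingYears)
{
    return startMips * std::pow(2.0, years / doublingYears);
}
```

Projected(100.0, 100.0, 1.0) comes to about 1.3 × 10^32 MIPS, which matches the quoted figure and confirms the 12-month assumption. With an 18-month doubling, Projected(100.0, 100.0, 1.5) gives roughly 1.2 × 10^22 MIPS, ten orders of magnitude less but still nowhere near an asymptote.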

Still, even if technological trends contin- 
ue to ramp up exponentially, that doesn’t 
lead to an asymptote. But the argument is 
that there’s more afoot than Moore’s Law, 
or rather, than whatever forces lie behind 
the useful observational extrapolation that 
is Moore’s Law. Gleick and Brand report 
on the views of assorted more or less astute
observers, many of whom see
progress going kerflooey in a few years. 
There is some sloppiness in the way some 
of these people talk about the data, it’s 
true. But the notion of an asymptote, a 
singularity, comes from plotting lots of 
trends, I think, and observing where the 
best-fitting curve seems to go. The best- 
fitting curve isn’t, apparently, a Moore’s 
Law-obeying exponential curve. At some
time in this century the curve really does 
seem to go vertical. Progress becomes in- 
finite. So say the curves. 







What that means is hard to say. I think 
it’s safe to say it doesn’t mean processors 
will have infinite speed in this century. I 
suspect it just means that we are reaching 
the end of our ability to predict the fu- 
ture. But that’s plenty for it to mean. 


Put a Gizmo in that Gadget 
I never sold you an Edsel, but I did ded- 
icate some precious DDJ pages a couple 
of years ago to sharing my experiences in 
becoming a Newton software developer. 
Who can forget my mushroom identifier 
program? If I had only waited until Digi- 
Scents came out with their iSmell device, 
I could have incorporated the all-impor- 
tant dimension of smell into the program. 
Of course, iSmell doesn’t work with a 
Newton MessagePad, but these days who 
does? Not I. I have joined the rest of the 
human race, leaving the company of fa- 
natics. Well, leaving that particular group 
of fanatics anyway. I don’t give up easily, 
as the abused deceased quadrupeds and 
currency-stuffed rodent holes around here 
attest, but I do eventually get the picture. 
So I’m not actually going to exhort you 
to become a Visor Springboard develop- 
er in your spare time. In fact, maybe you 
shouldn’t. But you might be tempted. Let’s 
consider the pros and cons. The Palm 
handheld device platform has been a run-
away success, and a lot of people have 
written little programs for the Palm. Some 
people have made money, but there are 
complaints that Palm and other compa- 
nies have been giving away Palm software 
so freely that they are making it hard to 
make a buck selling the stuff. Now Palm 
is espousing some sort of services mod- 
el, which doesn’t seem particularly help- 
ful to third-party developers, either. 

When the Palm founders went off to start 
their own company, licensing the Palm OS 
and coming out with a cheaper, better de- 
vice, it looked like here was the boffo plat- 
form, especially with those add-in mod- 
ules. But the lower price of the Visor can 
reasonably be expected to lower consumer 
expectations about what they should be 
asked to pay for software and peripherals, 
once again making it tough to make a buck 
doing third-party products. 

Still, a tiny piece of a big pie is sweet. 
At last fall’s Comdex trade show,
Handspring’s booth was swamped. Voices
in the crowd were heard comparing
Handspring’s Visor with the Palm device 
(a Wired news reporter caught this: “This 
is thinner, and you can upgrade and get 
more memory without having to open it... 
the software is easily transferable; the ex- 
pansion slot is the thing, plus, it’s a nice 
funky color”), and with Windows CE de- 
vices (“It takes more power and guts to 
support CE...it will put Windows CE out 
of business”). 

The development opportunity is in those 
Springboard modules. Visor’s Springboard 
feature is the first real plug-and-play ca- 
pability for handhelds. Just plug a device 
into the back and the device’s capabilities 
are immediately displayed on the Visor’s 
screen. No drivers to install; plug in a cel- 
lular phone device and Visor becomes a 
cellular phone, plug in a GPS module and 
the Visor is a GPS device. The develop- 
er’s kit is freely downloadable from the 
Visor site (http://www.handspring.com/), 
the Visor folks seem developer friendly, 
and the development process is simple. 
Some people are going to make a lot of 
money developing Visor modules. So it 
seems to me. 

Reports are that Palm hasn’t been quite 
so helpful to developers. Nevertheless, 
there is a PalmOS emulator in Java that 
I feel I must mention. You can read about 
it at http://www.javaworld.com/javaworld/ 
jw-11-1999/jw-11-device.html. Whether 
it’s Visor, Palm, or Windows CE, I see a 
lot of software out there now for hand- 
helds. Hmm, there’s a fantastic opportu- 
nity. I could port it all to the Newton! 
And publish the code here! Only kidding. 


DDJ


C PROGRAMMING


The S Programming Language 


Al Stevens 


Today is Judy’s birthday. My brother
Walt is visiting from his home in Jamaica,
and I suggested we all go out
for dinner and some bar-hopping to
celebrate. He seemed surprised and 
asked wouldn’t I prefer instead to be 
alone with Judy on this special night for 
“a quiet romantic evening together.” I 
looked at him incredulously and, after a 
pause, blurted out, “But she’s 58!” She 
was coming around the corner out of 
the kitchen at the time and heard the 
whole thing. It’s been kind of quiet 
around here since then. This is going to 
cost me. 


The Editor and its Scripts 

In January of last year, I introduced a pro- 
grammer’s editor project (named unimagi- 
natively “Editor”), which is meant to be- 
come the integrated editor in Quincy 99. 
Quincy is a Win32 integrated development 
environment that supports development of 
GUI and console Win32 applications. Quincy
uses gcc-mingw32, an open source port
of the GNU C/C++ compiler suite that runs
under Win32 and supports calls to the 
Win32 API. You can find Quincy 99 and 
instructions for where to find the compil- 
er at http://www.midifitz.com/alstevens/ 
quincy99/ and from DDJ. 

When I integrate the Editor program 
into Quincy 99, I'll also do a major facelift 
to the development suite including a Stan- 
dard C++ library, improved debugging fea- 
tures, and better project management; 
Quincy 99 then becomes Quincy 2000. 
That is the plan. I’m still waiting for the 
gcc volunteers to finish the Standard li- 
brary, which should be soon. I have a pre- 
liminary version of it and it looks good so 
far. In the meantime, I continue to work 
on the Editor. I first discussed that project 
here about a year ago. It’s currently avail- 
able as a standalone program, and I en- 
courage everyone to download it, use it, 
and send comments to me about it. You
will find the Editor at
http://www.midifitz.com/alstevens/editor/.


Al is a DDJ contributing editor. He can be
contacted at astevens@ddj.com.


User-Defined Improvements 

I tested Quincy 99 with a substantial num- 
ber of programmers downloading and test- 
ing it, and, with their generous help, made 
many improvements and corrections to 
the program. 

Teachers and students around the world 
are using Quincy because it is a free 
Win32 development platform and because 
it resembles a high-end IDE. Several of 
these users have offered suggestions to 
improve Quincy’s performance in the 
classroom environment, which often in- 
volves networks. Although its original pur- 
pose was to support a single user in a 
C++ self-teaching situation, these students 
and teachers convinced me to make Quin- 
cy work in a network, and give users more 
control over where to find source files and 
where the compiler writes compiled files. 

An oft-repeated request was to have the 
editor highlight syntax— keywords and 
comments — with different text colors. To 
that purpose, I added syntax highlighting 
to the Editor program, and then I worried 
about it. The Editor is an exercise in us- 
ing Standard C++ containers, iterators, and 
algorithms to implement text editing and 
template abstractions to implement se- 
lected block marking and undo/redo. I re- 
fer you to the columns in the beginning 
months of 1999 for discussions of those 
features. 

The most elegant programming solu- 
tions are not always the most efficient, 
unless of course, efficiency is your sole 
measure of elegance. In my view, elegant 
code is reliable, reusable, maintainable, 
extensible, and, most importantly, read- 
able. After that, it can be efficient if pos- 
sible. Squeezing that last nanosecond out 
of a tightly written algorithm is pointless 
if the optimized code has become too ab- 
struse for anyone to understand. Having 
written editors in the past, I understand 
that maintaining a text buffer, rendering 
the text on the screen, and keeping the
data and its visual representation in sync
can be a time-intensive process. When I 
wrote text editors for the machines of 
yore, I employed cryptic optimizations to 
minimize the cycles consumed by those 
operations. I worried that adding code to 
the Editor to scan each displayed line of 
text for keywords and comments and 
changing colors accordingly might add 
too much overhead and make scrolling 
and paging look jerky. Setting aside my 
concerns, I got text highlighting working 
on my trusty old P300. Then, says I, what 
is the least amount of hardware that any- 
one ought to expect this Editor and Quin- 
cy to run on? The slowest thing I have ca- 
pable of running Windows 98 is a P120 
laptop, which, in this day of 700-MHz machines,
is really low end. I figured if the
Editor works on that old laptop, it should 
be acceptable for any target suitable for 
Win32 development. (This attitude is un- 
like the Bill Gates model of software de- 
velopment, which is: By the time you get 
it ready to ship, the hardware will have 
caught up.) The Editor with syntax high- 
lighting works fine on that old P120. 


A Resurrected Script 

Never mind going back only one year to 
revisit a project, how about going back 10 
years? In the May 1989 issue, I described 
a homebrew C variant named “S,” which 
is a script language I designed for appli- 
cations that need scripts. This was in the 
days before JavaScript and VBA. I implemented
an S interpreter and provided a
shell program that used the interpreter to 
run source code programs from the com- 
mand line. I designed the interpreter to 
be reusable; an application would provide 
a shell process and some intrinsic func- 
tions that users of scripts could call. The 
following month, I integrated the inter- 
preter into a communications program 
named “Smallcom,” which was an ongo- 
ing project for the column at that time. 
The application and the interpreter were 
written in C and ran in MS-DOS text mode 
from the command line.

The Editor program needs a scripting
tool to implement complex macros. It al- 
ready has a simple keystroke macro 
recorder, which works well for repetitive 
editing tasks but cannot implement fea- 
tures such as smart indenting, brace match- 
ing, and such. Originally, I put hooks into 
the Editor so that users, who are pre- 
sumably C++ programmers, can imple- 
ment such features by extending the pro- 
gram in source code and recompiling it; 
the script language was C++ itself. Some- 
thing about that approach stuck in my 
craw. Real program editors have real script 
facilities. 

I retrieved the old S interpreter code to 
see if it could serve as a script interpreter 
for the Editor program. It looked as though 
it could, but I wrote it in C and used many 
archaic C idioms that contemporary C++ 
programmers, or at least this C++ pro- 
grammer, find cumbersome. So I rewrote 
the interpreter in C++, and you can down- 
load it. The package includes a shell pro- 
gram that tests the interpreter by loading 
and executing text source code files writ- 
ten in the S language; see “Resource Cen- 
ter,” page 7. I haven’t integrated the in- 
terpreter into the Editor, yet, but that’s 
next. If you have any of the Dr. Dobb’s 
CD releases, you might want to compare 
this month’s C++ version with the C ver- 
sion from 10 years ago. 


The S Programming Language 

S is a small variant of C that implements 
functions, local and global variables, literal 
constants, and three data types: char, int, 
and string. S supports for, while, if, and else. 
It has no preprocessor. An S script, like a 
C program, must have a main function to 
get things started. The shell that drives the 
interpreter provides intrinsic functions that 
S scripts can call. These functions may re- 
turn any of the data types and may accept 
any of the data types as arguments. 

When S becomes the Editor’s script lan- 
guage, the Editor will provide intrinsic 
functions that return a string of text from 
a specified line in the text buffer, return 
the current insertion cursor position, re- 
turn the range of a selected block, posi- 
tion the insertion cursor, insert text into 
and delete text from the buffer, do search 
and replace operations, support generic 
dialog box data entry, and anything else 
that I need when I start writing scripts. 

Listing One is si.cpp, the shell program
that exercises the interpreter. It demonstrates
what a shell process must provide
to use the interpreter. The shell has to
provide two functions that the interpreter uses
to get script source code characters to
interpret. Those functions are named getsource
and ungetsource, and the shell’s versions
of them, which simply call Standard
C’s getc and ungetc, are at the bottom of
si.cpp. They call the C functions to get
characters from the script file that the shell
opens from a command-line argument.
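
The same pair of functions could just as well read from memory instead of a file, which is closer to what an editor hosting the interpreter would need. The following is only a sketch under that assumption: StringSource and the sample text "ab" are hypothetical names and data, not part of si.cpp, and only the getsource and ungetsource signatures are taken from the listing.

```cpp
#include <cstdio>    // EOF
#include <string>

// Hypothetical in-memory script source; not part of the published listings.
class StringSource {
    std::string text;
    std::string::size_type pos;
public:
    explicit StringSource(const std::string& s) : text(s), pos(0) { }
    // Return the next character, or EOF when the text is exhausted.
    int get()
    {
        return pos < text.size()
            ? static_cast<unsigned char>(text[pos++]) : EOF;
    }
    // Push the most recently read character back (EOF is ignored).
    void unget(int c)
    {
        if (c != EOF && pos > 0)
            --pos;
    }
};

static StringSource script("ab");   // placeholder text, not a real S script

// The two functions the interpreter expects the shell to supply.
int getsource() { return script.get(); }
void ungetsource(int c) { script.unget(c); }
```

Because the interpreter only ever sees these two functions, swapping the file for a buffer should not require changes on the interpreter side.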
The shell also provides a table of Intrinsic
objects that describe the shell’s intrinsic
functions to the interpreter. The example
shell in si.cpp provides four intrinsic
functions (printf, getchar, putchar, and
getversion) to illustrate how the interpreter
interacts with intrinsic functions.
Each table entry provides a string with the
function’s return type and name and the
address of the shell function to call. The casts
in the intrinsic table initializers coerce the
functions into having the same signature
for purposes of initializing the array.
The interpreter passes all arguments to
intrinsic functions as a pointer to an array
of int variables. For string arguments the
corresponding entry in the array is really
a pointer to a null-terminated character
array. Arguments of type int and char are
passed as ints. When an intrinsic function
returns a string to the interpreter, as the
getversion function does, it passes a char*
value. The interpreter copies the string
text into its own memory, so the intrinsic
function’s copy can be safely discarded.
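
To make the calling convention concrete, here is a minimal sketch of two intrinsic functions that unpack their arguments from such an array. It makes one deliberate change, labeled as an assumption: the slots are std::intptr_t rather than int, so the pointer-in-an-integer trick also holds on platforms where pointers are wider than int. The names arg_t, istrlen, iadd, and call_strlen are hypothetical.

```cpp
#include <cstdint>
#include <cstring>

// Assumption: a slot wide enough for either an integer or a pointer.
// The published listing uses plain int, which presumes 32-bit pointers.
typedef std::intptr_t arg_t;

// Hypothetical intrinsic taking one string argument.
int istrlen(arg_t* p)
{
    const char* s = reinterpret_cast<const char*>(p[0]);
    return static_cast<int>(std::strlen(s));
}

// Hypothetical intrinsic taking two int arguments.
int iadd(arg_t* p)
{
    return static_cast<int>(p[0] + p[1]);
}

// How a caller might marshal a string argument: one slot per argument.
int call_strlen(const char* text)
{
    arg_t args[1] = { reinterpret_cast<arg_t>(text) };
    return istrlen(args);
}
```

The intrinsic itself decides how to interpret each slot; nothing in the array records whether an entry is a number or a pointer.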

As the interpreter compiles the source
code into bytecode and, later, as it interprets
the bytecode, it does some syntax
checking. If it finds an error, the interpreter
throws an exception of type
SIException with a code that identifies what
is wrong, a (possibly empty) std::string object
with some text that expands on the error,
and the line number in the source code
where the error was found. The example
shell translates the error into a message to
display on the console.
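
The translation step can be sketched in isolation. The types below are simplified stand-ins with assumed names (ScriptError, ErrorCode, describe), not the declarations from interp.h; they only illustrate the pattern of indexing a message table with the error code and appending the optional text and line number.

```cpp
#include <sstream>
#include <string>

// Simplified stand-ins for the interpreter's error types (names assumed).
enum ErrorCode { EARLY_EOF, UNRECOGNIZED, SYNTAX_ERROR, ERROR_COUNT };

struct ScriptError {
    ErrorCode code;
    int lineno;
    std::string msg;    // may be empty
    ScriptError(ErrorCode c, int ln, const std::string& m = std::string())
        : code(c), lineno(ln), msg(m) { }
};

// Turn an error into one line of console text.
std::string describe(const ScriptError& e)
{
    static const char* const text[ERROR_COUNT] = {
        "Unexpected end of file", "Unrecognized", "Syntax Error"
    };
    std::ostringstream out;
    out << text[e.code];
    if (!e.msg.empty())
        out << ' ' << e.msg;
    out << " on line " << e.lineno;
    return out.str();
}

// A failing "interpreter" call and the shell-side handler.
std::string run_and_report()
{
    try {
        throw ScriptError(UNRECOGNIZED, 12, "@");
    }
    catch (const ScriptError& e) {  // by reference, so no copy is made
        return describe(e);
    }
}
```

Keeping the message table in the shell rather than in the exception leaves the wording, and the language, of the messages up to the host application.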

Listing Two is interp.h, the header file
that an application shell includes to use the
interpreter. The application instantiates an
object of type SInterpreter, with the address
of the array of Intrinsic objects as an
initializer. The application then calls the
interpreter’s interpret function, which reads
the source code and runs the script.

Observe that both si.cpp and interp.h
have namespace statements that are commented
out. This is due to a bug in the Visual
C++ 5.0 STL container templates or in
the compiler, I don’t know which. Instantiating
STL containers parameterized on
types that have scope qualifiers causes the
compiler to issue an error that the type
name is not known. I can uncomment the
statements, and the gcc compiler compiles
the programs without error. For the same
reason, interp.h declares the Intrinsic, Token,
Datum, and Symbol classes outside the
SInterpreter class. Those classes really ought
to be inside SInterpreter or in a namespace,
but VC++ 5.0 won’t permit it. Because Editor
is still an MFC application, I have to use
VC++ to compile it, and I must keep the
interpreter compatible with VC++.

The interpreter is implemented in interp.cpp
(available electronically). It uses
typical interpreter logic beginning with a 
lexical scan that converts the source code 
into bytecode. Then it interprets the byte- 
code to run the program with a recursive 
descent parser. Just now the bytecode re- 
tains the text identifiers and searches the 
symbol table every time it encounters an 
identifier to resolve and dereference. I hope 
later to replace that logic by building a table 
of identifiers and using offset tokens in the 
bytecode instead of string identifiers. I also 
plan to replace the interpreting recursive 
descent parser by compiling expressions 
into a postfix stack architecture to improve 
performance. Eventually I can differentiate 
between source-code scripts and compiled 
scripts. Then maybe a debugger. Too much 
code, too little time. 
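
For readers who have not met the technique, the following is a minimal sketch of a recursive descent evaluator for integer expressions. It is not the code in interp.cpp: it works directly on characters rather than on bytecode tokens, and the class name Evaluator and the function evaluate are assumptions. The member names primary, mult, and plus are borrowed from interp.h only to suggest how the precedence levels might line up.

```cpp
#include <cctype>
#include <stdexcept>
#include <string>

// Minimal recursive descent evaluator for + - * / and parentheses.
class Evaluator {
    std::string src;
    std::string::size_type pos;

    // Skip blanks and return the next character, or 0 at the end.
    int peek()
    {
        while (pos < src.size() && src[pos] == ' ')
            ++pos;
        return pos < src.size() ? src[pos] : 0;
    }
    // primary: number | '(' expression ')'
    int primary()
    {
        if (peek() == '(') {
            ++pos;
            int v = plus();
            if (peek() != ')')
                throw std::runtime_error("Unmatched ()");
            ++pos;
            return v;
        }
        if (!std::isdigit(static_cast<unsigned char>(peek())))
            throw std::runtime_error("Syntax Error");
        int v = 0;
        while (pos < src.size() &&
               std::isdigit(static_cast<unsigned char>(src[pos])))
            v = v * 10 + (src[pos++] - '0');
        return v;
    }
    // mult: primary { ('*' | '/') primary }
    int mult()
    {
        int v = primary();
        for (;;) {
            int op = peek();
            if (op != '*' && op != '/')
                return v;
            ++pos;
            int rhs = primary();
            if (op == '/' && rhs == 0)
                throw std::runtime_error("Divide by zero");
            v = (op == '*') ? v * rhs : v / rhs;
        }
    }
    // plus: mult { ('+' | '-') mult }
    int plus()
    {
        int v = mult();
        for (;;) {
            int op = peek();
            if (op != '+' && op != '-')
                return v;
            ++pos;
            int rhs = mult();
            v = (op == '+') ? v + rhs : v - rhs;
        }
    }
public:
    explicit Evaluator(const std::string& s) : src(s), pos(0) { }
    int expression()
    {
        int v = plus();
        if (peek() != 0)
            throw std::runtime_error("Syntax Error");
        return v;
    }
};

int evaluate(const std::string& text)
{
    return Evaluator(text).expression();
}
```

Each precedence level is one function that calls the next-tighter level, which is why an interpreter built this way re-walks the grammar on every evaluation; compiling to a postfix form would do that walk once.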


Abstraction With #define 

Some C++ programmers don’t like the pre- 
processor— and for good reason. The lan- 
guage keeps taking on features that don’t 
get along well with the preprocessor. An 
example is namespaces. The preprocessor 
does its #define translations without con- 
sidering namespaces. Consequently, any 
macro a program #defines, irrespective of 
namespaces, has the potential for colliding 
with things that are otherwise properly pro- 
tected by namespaces. That’s why they tell 
you not to use underscore prefixes on your 
identifiers, particularly with your #define 
macros. Identifiers that begin with one un- 
derscore followed by an uppercase letter 
or with two underscores are reserved for 
the language implementation, and if your 
macro should happen to collide with some- 
thing in a standard header, it could cause 
all kinds of trouble. 
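
A small, contrived sketch shows how a macro can reach inside a namespace. The namespace audio and its functions are hypothetical; the point is only that the qualified call is rewritten before the compiler ever considers scope.

```cpp
namespace audio {
    inline int volume() { return 3; }
    inline int loud()   { return 11; }
}

// Suppose some unrelated header defines a macro with the same spelling...
#define volume loud

// ...then the qualified name below is rewritten to audio::loud by the
// preprocessor, and the namespace offers no protection.
inline int current_volume()
{
    return audio::volume();
}

#undef volume
```

This case compiles quietly and calls the wrong function, which is arguably worse than the more common outcome of a baffling compile error.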

Unlike many of my colleagues, I kind 
of like #define. Observe how I used it in 
interp.h to define an abstraction of the
overloaded operators for the Datum class.
I can hear the horrified gasps of disap- 
proval already. Abstraction with #define? 
How simply awful! 

The Datum class represents objects of 
the S language types. Each Datum object 
is either a string, char, or int. The class 
overloads the S language operators so the 
interpreter can perform those operations 
on objects of the types declared in the 
scripts. The code to overload most of the 
arithmetic operators is the same with the 
exception of the operator itself. Same with 
the relational operators and the unary op- 
erators. The only mechanism in C++ for 
passing an operator as an argument to a 
function is provided by the function-like 
macros of the #define preprocessor di- 
rective. There is no other way to do it. I 
wrote UNARY, LOGICAL, ARITHMETIC, 
and RELATIONAL macros to form abstractions
of reusable code for overloaded
operators. Observe those macros and how
the class calls them. Later, if I want to add
more operators to S (it doesn’t support
bitwise logical operators, or the += and
-= arithmetic operator formats, for example),
I simply add another macro call statement
to the class and put the code in the
interpreter to use the operator.

If that kind of programming offends
some sensibilities, so be it. I think it’s
kind of elegant, myself.

DDJ





Listing One 


#include <stdio.h>
#include "interp.h"

// ----- intrinsic functions
int iprntf(int* p)          // printf
{
    printf(reinterpret_cast<char*>(p[0]),p[1],p[2],p[3],p[4]);
    return 0;
}
int igtch()                 // getchar
{
    return getchar();
}
int iptch(int* c)           // putchar
{
    return putchar(*c);
}
const char* getver()        // return a string
{
    return "Version 1.9";
}
// ----- table that describes the intrinsic functions to the interpreter
Intrinsic funcs[] = {
    Intrinsic("int printf",        reinterpret_cast<ifunc>(iprntf)),
    Intrinsic("int getchar",       reinterpret_cast<ifunc>(igtch)),
    Intrinsic("int putchar",       reinterpret_cast<ifunc>(iptch)),
    Intrinsic("string getversion", reinterpret_cast<ifunc>(getver)),
    Intrinsic("", 0)               // terminal entry
};
// ---------- error messages (indexed by the errs enum in interp.h)
const char *erm[]={ "Unexpected end of file", "Unrecognized",
                    "Duplicate ident",        "Undeclared ident",
                    "Syntax Error",           "Unmatched {}",
                    "Unmatched ()",           "Missing",
                    "Not a function",         "Misplaced break",
                    "Out of place",           "Not an identifier",
                    "Mismatched arguments",   "Divide by zero",
                    "Invalid constant",       "No main function"
};
static FILE *fp;
int main(int argc, char *argv[])
{
    if (argc == 2)  {
        if ((fp = fopen(argv[1], "r")) != 0)    {
            try {
                SInterpreter si(funcs);
                si.interpret();
            }
            catch (const SIException& sex)  {
                printf("\n%s %s on line %d\n",erm[sex.ercode],
                            sex.msg.c_str(), sex.lineno);
            }
            fclose(fp);
        }
    }
    return 0;
}
// ----- functions that the interpreter requires
int getsource(void) { return getc(fp); }
void ungetsource(int c) {  ungetc(c, fp); }

Listing Two 
// ---------------- interp.h ----------------
#include <ctype.h>      // isalpha, isdigit
#include <vector>
#include <string>

// namespace DDJScriptInterpreter {

// ----------- error codes
enum errs {  EARLYEOF,        UNRECOGNIZED,
             DUPL_DECLARE,    UNDECLARED,
             SYNTAX,          BRACERR,
             PARENERR,        MISSING,
             NOTFUNC,         BREAKERR,
             OUTOFPLACE,      NOTIDENT,
             MISMATCHEDARG,   DIVIDEERR,
             INVALIDCONSTANT, NOMAIN };
class SIException {
public:
    errs ercode;
    int lineno;
    std::string msg;
    SIException(errs er = SYNTAX, int lno = 0, std::string m = std::string()) :
                    ercode(er), lineno(lno), msg(m)
    { }
};
typedef int (*ifunc)(void*);
// --- intrinsic function table (provided by shell application)
class Intrinsic {
public:
    std::string signature;
    ifunc fn;
    Intrinsic(const std::string& sig = std::string(), ifunc f = 0) :
                    signature(sig), fn(f)
    { }
};
typedef short int token;
enum DatumType { unknown, number, strng };

// --- each macro expands to one overloaded operator inside class Datum
#define UNARY(op)                           \
    Datum operator op () const              \
    {                                       \
        nostring();                         \
        return Datum(op value);             \
    }
#define RELATIONAL(op)                      \
    bool operator op (const Datum& d) const \
    {                                       \
        sametype(d);                        \
        return (type == strng) ? (strval op d.strval) : (value op d.value); \
    }
#define ARITHMETIC(op)                      \
    Datum operator op (const Datum& d) const \
    {                                       \
        nostring();                         \
        d.nostring();                       \
        return Datum(value op d.value);     \
    }
#define LOGICAL(op)                         \
    bool operator op (const Datum& d) const \
    {                                       \
        nostring();                         \
        d.nostring();                       \
        return value op d.value;            \
    }

class Datum {
    void nostring() const
        { if (type == strng) throw SIException(); }
    void sametype(const Datum& d) const
        { if (type != d.type) throw SIException(); }
public:
    DatumType type;
    int value;              // number value
    std::string strval;     // string value
    Datum() : type(unknown), value(0)
        { }
    explicit Datum(int val) : type(number), value(val)
        { }
    explicit Datum(std::string str) : type(strng), value(0), strval(str)
        { }
    Datum& operator=(const Datum& d)
        { type = d.type; value = d.value; strval = d.strval; return *this; }
    Datum operator+(const Datum& d) const
    {
        sametype(d);
        if (type == strng)
            return Datum(strval + d.strval);    // concatenate strings
        return Datum(value + d.value);          // sum numbers
    }
    bool operator!() const
    {
        nostring();
        return !value;
    }
    UNARY(-)
    ARITHMETIC(*)
    ARITHMETIC(/)
    ARITHMETIC(-)
    RELATIONAL(<=)
    RELATIONAL(>=)
    RELATIONAL(!=)
    RELATIONAL(==)
    RELATIONAL(<)
    RELATIONAL(>)
    LOGICAL(&&)
    LOGICAL(||)
};
class Token {
public:
    token tok;
    Datum datum;
    int tokennumber;
    Token(token t = 0) : tok(t)
        { }
    bool operator<(const Token& t) const
        { return tok < t.tok; }
    bool operator==(const Token& t) const
        { return tok == t.tok; }
    Token& operator=(const Token& t)
        { tok = t.tok; datum = t.datum; return *this; }
};
typedef std::vector<Token> token_buffer;
typedef token_buffer::iterator token_iter;
enum SymbolType { none, variable, ifunction, pfunction };
class Symbol {
public:
    SymbolType type;
    std::string name;
    Datum datum;
    int entry;      // subscript to function's first entry in token buffer
    ifunc fn;       // points to intrinsic function
    Symbol(SymbolType ty = none, const std::string nm = std::string()) :
                    type(ty), name(nm), entry(0), fn(0)
        { }
    bool operator<(const Symbol& s) const
        { return name < s.name; }
    bool operator==(const Symbol& s) const
        { return name == s.name; }
    Symbol& operator=(const Symbol& s)
        { type = s.type; name = s.name; datum = s.datum;
          entry = s.entry; fn = s.fn; return *this; }
};
typedef std::vector<Symbol> symbol_table;
typedef symbol_table::iterator symbol_iter;
class SInterpreter {
    token_iter tokiter; // iterates the token buffer during interpreting
    class Keyword {
    public:
        std::string kw;
        Token kwtoken;
        Keyword(const char* k, Token tk) : kw(k), kwtoken(tk)
            { }
    };
    symbol_table symboltable;
    token_buffer tokens;
    // ...
    int currentscope;   // index of first symbol table entry for current scope
    static token tokentbl[];
    static Keyword keywords[];
    Datum frtn;         // return value from a function
    bool breaking, returning;
    // ...
    int linenumber;
    bool scanned;       // true when lexical scan is complete
    int LineNumber();   // current source file line number

    void initialize();  // initialize data variables

    // functions for lexical scan
    void lexicalscan();
    bool declarator(bool islocal, bool isparameter = false);
    void declarators(bool islocal);
    Token compilenextsourcetoken();
    int escseq();
    int getsourcechar();
    int getrawsourcechar();

    // functions for compiling and interpreting program
    Token nexttoken();
    void prevtoken();
    Token needtoken(token tkn);
    Datum function(Symbol sym);
    bool findsymbol(int& ndx, const std::string& name, int fromscope = 0);
    void compound_statement(int scope);
    void statement();
    void outofscope();
    void statements();
    void skip_statements();
    bool istoken(token tkn);
    void skippair(token ltkn, token rtkn);
    Datum primary();
    Datum mult();
    Datum plus();
    Datum le();
    Datum eq();
    Datum and();
    Datum expression();
    bool isidentchar(int c)
    {
        return isalpha(c) || isdigit(c) || c == '_';
    }
    bool iswhite(int c)
    {
        return c == ' ' || c == '\t';
    }
public:
    explicit SInterpreter(const Intrinsic* inf);
    void interpret();
};
// } // namespace DDJScriptInterpreter
// ----- functions that the shell must provide
int getsource();
void ungetsource(int ch);


http://www.ddj.com Dr. Dobb’s Journal, February 2000 113 


JAVA Q&A





How Do You Plug 
Java Memory Leaks? 


Ethan Henry and Ed Lycklama 


One of Java’s biggest selling points
has been its supposed immunity to
one of the most challenging programming
problems: memory leaks. But some Java
developers have observed their Java programs
exhibit classic memory-leak behavior:
unbounded memory growth leading to poor
performance and eventually crashing.
What’s going on?

First, let’s look at how dynamic memo- 
ry management works in Java and under- 
stand what the garbage collector does. 
Objects are allocated on the heap using 
the new operator and accessed via refer- 
ences. Probably the easiest way to think 
about memory in Java is to picture the 
heap forming a directed graph, where ob- 
jects form the nodes and the references 
between objects make the edges. The 
garbage collector sees the memory this 
way, as a graph of objects and references. 

The purpose of the garbage collector is 
to remove from memory objects that are 
no longer needed. This is a hard problem 
to solve — the garbage collector can’t tell 
whether you need a particular object, so 
it uses an approximation and looks for 
objects that are no longer reachable. Us- 
ing the directed graph analogy, it looks 
for objects that can’t be reached by any 
path starting from a root. Roots, fixed 
places that are always guaranteed to ex- 
ist, are the starting points for the garbage 
collector. In Java, the roots include static 
fields in classes and locals on the stack. 
Anything that the garbage collector can’t 
reach from one of the program’s roots by 
any path is considered garbage. 
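
The reachability test itself is easy to sketch. The code below is an illustration in C++ with a toy heap modeled as a vector of nodes; the names Node, reachable, and demo are assumptions, and a real Java collector is considerably more sophisticated. It shows only the marking idea: start at the roots and follow every edge.

```cpp
#include <cstddef>
#include <set>
#include <vector>

// Each object is a node; refs holds the indexes of the objects it refers to.
struct Node {
    std::vector<std::size_t> refs;
};

// Mark phase: collect everything reachable from the roots by any path.
std::set<std::size_t> reachable(const std::vector<Node>& heap,
                                const std::vector<std::size_t>& roots)
{
    std::set<std::size_t> marked;
    std::vector<std::size_t> pending(roots);
    while (!pending.empty()) {
        std::size_t n = pending.back();
        pending.pop_back();
        if (!marked.insert(n).second)
            continue;                   // already visited
        for (std::size_t i = 0; i < heap[n].refs.size(); ++i)
            pending.push_back(heap[n].refs[i]);
    }
    return marked;
}

// Toy heap: 0 -> 1 -> 2, plus a cycle 3 <-> 4 that no root reaches.
std::set<std::size_t> demo()
{
    std::vector<Node> heap(5);
    heap[0].refs.push_back(1);
    heap[1].refs.push_back(2);
    heap[3].refs.push_back(4);
    heap[4].refs.push_back(3);
    std::vector<std::size_t> roots(1, 0);   // object 0 is the only root
    return reachable(heap, roots);
}
```

Anything left unmarked, here objects 3 and 4, is garbage by this definition, even though those two objects still refer to each other.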

To illustrate this, look at Example 1 and
Figure 1. The method has two local references
on the stack, m1 and m2. There’s
also a variable created outside the scope
of this method called global. m1 and m2
are, temporarily at least, two roots for the
garbage collector. Two objects are created
and two references, or edges, are created
to those objects (from the locals on
the stack). Another reference is added
from m1 to m2 and a reference is added
from the global object. When the method
returns, m1 and m2 are no longer on the
stack, so the first object that was created
is no longer reachable. Because the
garbage collector can no longer reach that
object by some path it will, at some point
in the future, clean up that object. It’s
important to note that garbage collection
does not happen immediately, but on a
periodic basis. Even though the object will
stay in memory for some period of time
until the garbage collector releases it, it
remains unreachable and can’t be reused.


Ethan is Java Evangelist and Ed is chief
technology officer for the KL Group. They
can be contacted at egh@klgroup.com and
el@klgroup.com, respectively.

There are some common myths about 
garbage collection in Java that are worth 
cleaning up. The first one is that the 
garbage collector can’t handle cycles — it 
can. That is, if you have three objects — 
A, B, and C—with references from A to 
B, B to C, and C to A, and those are the 
only references to those objects, the 
garbage collector will clean those objects 
up. This is in contrast to other systems 
that use reference counting techniques 
(such as Microsoft's COM), which do have 
problems handling cycles in the object ref- 
erence graph. 
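
The reference-counting weakness is easy to reproduce in C++ (C++11 or later) with std::shared_ptr, which also counts references. This is a sketch for illustration only: the types Strong and Weak and the function alive_after_dropping_cycle are made up, and COM’s counting works differently in detail.

```cpp
#include <memory>

static int live_nodes = 0;  // constructed-but-not-yet-destroyed nodes

struct Strong {
    std::shared_ptr<Strong> next;   // owning (counted) reference
    Strong()  { ++live_nodes; }
    ~Strong() { --live_nodes; }
};

struct Weak {
    std::weak_ptr<Weak> next;       // non-owning reference
    Weak()  { ++live_nodes; }
    ~Weak() { --live_nodes; }
};

// Build a two-object cycle, drop every outside reference, and report
// how many objects are still alive at that moment.
template <class T>
int alive_after_dropping_cycle()
{
    live_nodes = 0;
    std::weak_ptr<T> observer;      // lets the demo clean up afterward
    {
        std::shared_ptr<T> a = std::make_shared<T>();
        std::shared_ptr<T> b = std::make_shared<T>();
        a->next = b;
        b->next = a;
        observer = a;
    }                               // a and b go out of scope here
    int alive = live_nodes;
    if (std::shared_ptr<T> p = observer.lock())
        p->next.reset();            // break the cycle by hand
    return alive;
}
```

With counted references in both directions, neither count ever drops to zero, so both objects survive until the cycle is broken by hand. A tracing collector has no such problem because it asks whether a root can reach the objects, not how many references point at them.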

The second myth, and this is really for 
people who’ve moved to Java from C++, 
is that the finalizer is the same as a C++ 
destructor — it isn’t. There are a number 
of subtle differences, but the most important
one is that the finalizer is not guaranteed
to be called, unlike a destructor in
C++, which runs deterministically when an
object is deleted or goes out of scope.
You can’t reliably depend on the finalizer
in Java. One interesting
piece of trivia, however, is that if
the finalizer is called, it’s possible for it to 
resurrect the object, by making a refer- 
ence to the object that’s about to be 
garbage collected from another object, 
thus making it reachable again. While this
is a bad thing to do in practice, the
garbage collector is aware of the fact that 
it can, in theory, happen. 


Loiterers 

Now that we’ve talked about what the 
garbage collector is and what it does, let’s 
look at what it means to have a memory 
leak in Java. As Figure 2 illustrates, there 
are three states that an object can be in: 


• Allocated objects are all objects that have
been created but not yet removed by the
garbage collector.

• Reachable objects are all the allocated
objects that can be reached from one
of the roots.

• Live objects are reachable objects that
are being actively used by your program.


The garbage collector takes care of ob- 
jects that are allocated but unreachable. 
In contrast, these objects would be mem- 
ory leaks in C++, memory that’s perma- 
nently lost to the program. Tools like Ra- 
tional’s Purify and Numega’s BoundsChecker 
are designed to help track down this kind 
of problem in C++, finding objects that 
are allocated but no longer reachable. 

In Java, the situation is different. The 
garbage collector takes care of the allo- 
cated but unreachable objects for you, so 
a Java memory leak is instead an object 
that’s reachable but not live. Even though 
you have a reference to that object some- 
where and there’s a path to that object 
from some root, the object isn’t needed 
by the program and could be disposed 
of— if there wasn’t a reference to it. 

So one contrast between memory leaks 
in C++ and Java is that in C++ once you 
leak an object, the problem can’t be fixed 
by the program— there are no remaining 
references to that object. In Java, the ob- 
ject itself can be reached, but the code that 
manages the object may not be accessible 
to you; for example, the reference to the 
unneeded object might be from a private 
field in a class for which you don’t have 
the source code. On the other hand, if the
reference itself is accessible, then there
should be some action the program can
take to remove all the references to the
object, making it unreachable and eligible
for garbage collection.

Another difference, going back to the 
analogy of viewing the heap as a directed 
graph of objects and references, is that 
in C++ you have to manage both the 
nodes and the edges. Every time you add 
or remove objects or references, you’re 
changing the collection of both nodes 
and edges. If you leave some edges 
hanging, by freeing an object without re- 
moving all the pointers to that object, 
you get a dangling pointer, which usu- 
ally results in something like Windows’ 
infamous GPF error. Conversely, if you 
leave a node hanging, by removing all 
the pointers without removing the node, 
you have a memory leak. In Java, you 
can only do the second of these two 
things, removing the edges. Ultimately, 
you only have control over the refer- 
ences, so you have to think about man- 
aging just the edges. If you don’t remove 
references to objects, the garbage col- 
lector can’t remove them. You have to 
assist the garbage collector by managing 
the edges. 

Example 1: A method with two local
references on the stack, m1 and m2.

Figure 1: A method with two local
references on the stack, m1 and m2.

One thing that we have found in investigating
memory leaks in Java is that
they are rarer than they are in C++. In
C++, it’s easy to get a memory leak by not
writing destructors for classes or not bothering
to free memory on the heap. But in
Java, the garbage collector does a lot of
this work for you. The flip side to this is
that the impact of memory leaks, the
amount of memory that’s being lost, tends
to be much greater in Java.

The reason is that when you have an 
object that’s not being used any more, it’s 
rarely the case that there’s just a single 
object. That object will have references to 
other objects, which will have more ref- 
erences, and so on, forming a large sub- 
graph of objects that are leaked, just be- 
cause one reference wasn’t properly 
cleared. For example, in Swing or AWT programming,
containers (such as panels or
frames) include child components
(buttons, text fields, and the like). The
container can reach all of its children as 
it has references to them (to lay them out). 
At the same time, each component has a 
reference back to its parent. There is, 
therefore, a path from every object in the 
user interface to every other object. Com- 
pounding the problem, UI objects are of- 
ten subclassed, adding additional refer- 
ences and objects into the subgraph. The 
result is that the memory leak is not just 
a small set of components, it can be a very 
large collection of objects that’s leaking. 
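
To make the point concrete, here is a minimal sketch using stand-in classes. Widget and Panel are hypothetical names invented for this illustration, not the AWT or Swing types. It shows that a single leftover reference to one child is enough to keep the whole tree reachable through the parent links.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-ins for UI components; not the real AWT/Swing classes.
class Widget {
    Panel parent;                 // edge back to the container
    final String name;
    Widget(String name) { this.name = name; }
}

class Panel extends Widget {
    final List<Widget> children = new ArrayList<Widget>();  // edges to the children
    Panel(String name) { super(name); }
    void add(Widget w) { children.add(w); w.parent = this; }
}

class SubgraphDemo {
    // Counts every widget reachable from w by walking up to the root
    // and then down through all the children.
    static int reachableFrom(Widget w) {
        Widget root = w;
        while (root.parent != null) root = root.parent;
        return countTree(root);
    }

    private static int countTree(Widget w) {
        int n = 1;
        if (w instanceof Panel) {
            for (Widget c : ((Panel) w).children) n += countTree(c);
        }
        return n;
    }

    public static void main(String[] args) {
        Panel frame = new Panel("frame");
        Panel toolbar = new Panel("toolbar");
        frame.add(toolbar);
        Widget ok = new Widget("ok");
        toolbar.add(ok);
        toolbar.add(new Widget("cancel"));
        frame.add(new Widget("status"));

        frame = null;      // the program drops its own references...
        toolbar = null;
        // ...but the one leftover reference to "ok" still reaches all five widgets
        System.out.println(reachableFrom(ok));
    }
}
```

In a real user interface the same thing happens through the component's parent reference and the container's list of children, so one forgotten reference to a button can pin an entire window.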

Since there are many distinct differences 
between memory leaks in C++ and Java, 
it’s confusing to use the same term to refer
to both of them. Therefore, we refer
to these unused objects in Java as “loiterers.”
The dictionary definitions of “loiter”
are “to delay an activity with aimless
idle stops and pauses” (which will hap- 
pen as the garbage collector has more and 
more objects to check on each pass) and 
“to remain in an area for no obvious rea- 
son” (you’re not using them, so why are 
they there?)— both fairly apt descriptions 
of what’s going on. Another good reason 
to use a different term is that the Java Vir- 
tual Machine and many of the libraries 
have native code in them, written in C++, 
and that code may have memory leaks in 
it, leading to confusion as to whether a 
leak is in Java code or C++ code that’s un- 
derneath the Java. 


Lexicon of Loiterers 
To further clarify and understand how loiterers
occur, we’ve identified four different
patterns of loitering objects (and you
may see a theme here):


Lapsed Listeners. A lapsed listener is 
when an object is added to a collection 
but never removed. The most common 
example of this is an event listener, where 
the object is added to a listener list, but 
never removed once it is no longer need- 
ed. So the object’s usefulness has lapsed 
because although it’s still in the list, re- 
ceiving events, it no longer performs any 
useful function. One of the side effects of 
this is that the collection of listeners may 
be growing without bound. You can keep 
adding listeners to a collection, but they 
are never removed. This causes the pro- 
gram to slow down as events have to be 
propagated to more and more listener ob- 
jects, causing each event to take longer 
and longer to process. This is probably
the most common memory-usage problem
in Java. Swing and AWT are very
susceptible to this problem, and it can occur
easily in any large framework. For example,
see bug #4177795 in the Java Developer’s
Connection (at
http://developer.java.sun.com/developer/bugParade/index.html).
In this case, instances of the
javax.swing.JInternalFrame class were loitering
if a menu bar had been added to
them. Through a long series of events, it
turned out that the hashtable that keeps 
track of all keystrokes registered for menu 
shortcuts was holding onto a reference to 
the menu, which was holding onto the in- 
ternal frame, preventing any of these ob- 
jects from being garbage collected, even 
after all the references from inside the pro- 
gram were removed. It’s surprisingly easy 
to create this kind of problem. 

In contrast, this kind of problem rarely 
occurs in a C++ program. The memory 
would probably be freed without remov- 
ing the pointer from the list, creating a 
dangling pointer. When the program walks 
through the list and tries to dispatch the 
event via the bad pointer, the program 
would probably crash. Whether it’s better 
to leak memory or to crash is for you to 
decide. 


Figure 2: An object can be in one of three states. 

Another example of a lapsed listener in
Java 2 is a method on java.awt.Toolkit
called addPropertyChangeListener(). You
can register a listener there to receive
notification whenever any desktop properties
change, such as the resolution of the
desktop. Because the Toolkit class is a
Singleton, there’s only ever one instance of
it, which is created at the start of the
application and survives for the lifetime of
the entire application. Most listeners,
however, are going to have much shorter life
spans. If you have a reference from something
that has a long life span to something
that has a short life span, then the
short-lived object is now going to live
much longer, as the reference from the
long-lived object will keep it around
indefinitely. You have to remember to call
removePropertyChangeListener() whenever
the listener object is destroyed. This
isn’t really when the listener is literally
destroyed, as the garbage collector does
that; it’s when you decide that the listener
object is no longer needed by the program.

One strategy you can use to avoid
lapsed listeners is to make sure all the
add and remove calls are paired. Doing
this is as simple as using tools such as
grep or the find command in your favorite
editor to search for calls to
addXXXListener and removeXXXListener.
Furthermore, it’s good practice to pair them
close together in your code and not to
have the add and remove listener calls
spread far apart in separate methods or
source code files. Otherwise, at some point
in the future the calls are going to get
decoupled and you’re going to create a
loitering object problem again. Another thing,
shown in the Toolkit example, is to pay
attention to object lifecycles: creating
references from a long-lived object to a
short-lived object ties both objects together,
giving them the long-lived object’s lifetime.
Finally, you might want to consider
a larger solution, such as implementing
a listener registry or a publish/subscribe
mechanism, to decouple listeners
from event sources. You should be
suspicious of any framework code that
claims to clean up this sort of problem
automatically, as it’s probably built on a
set of assumptions that, if broken, will
cause the framework to fail and possibly
cause more loiterers.
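
The pairing advice can be sketched as follows. This is a minimal illustration, not code from the listings: ChangeListener, EventSource, and StatusView are hypothetical names, and the event source is a plain list rather than a real AWT or Swing class. The point is only that the add call in the constructor has a matching remove call a few lines away.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical listener interface and event source, for illustration only.
interface ChangeListener {
    void changed(String what);
}

class EventSource {
    private final List<ChangeListener> listeners = new ArrayList<ChangeListener>();
    void addChangeListener(ChangeListener l) { listeners.add(l); }
    void removeChangeListener(ChangeListener l) { listeners.remove(l); }
    int listenerCount() { return listeners.size(); }
    void fire(String what) {
        // iterate over a copy so listeners may remove themselves while being notified
        for (ChangeListener l : new ArrayList<ChangeListener>(listeners)) l.changed(what);
    }
}

// A short-lived object that listens to a long-lived source.
class StatusView implements ChangeListener {
    private final EventSource source;
    int eventsSeen = 0;

    StatusView(EventSource source) {
        this.source = source;
        source.addChangeListener(this);      // add ...
    }
    // ... and the matching remove, kept right next to it
    void dispose() {
        source.removeChangeListener(this);
    }
    public void changed(String what) { eventsSeen++; }
}

class LapsedListenerDemo {
    public static void main(String[] args) {
        EventSource longLived = new EventSource();
        for (int i = 0; i < 3; i++) {
            StatusView view = new StatusView(longLived);
            longLived.fire("resolution");
            view.dispose();   // without this call, every view would stay on the list
        }
        System.out.println(longLived.listenerCount());
    }
}
```

If the dispose() call is dropped, the listener list grows by one on every pass through the loop and each view remains reachable from the long-lived source.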

Lingerers. The second type of loiterer
is a lingerer: an object that hangs on for
a while after the program is finished with
it. Specifically, it occurs when a reference
is used transiently by a long-lived object,
but isn’t cleared when finished with. The
next time the reference is used it will probably
be reset to refer to a different object,
but in the meantime, the previous object
loiters about. In C++, this would again be
a benign dangling pointer, where the object
being referenced would have been
manually freed and the bad pointer would
have been retained, but you’d never notice,
as the next time the pointer is used,
it will be reset to point to some other valid
object.

An example of this might be a print ser- 
vice in an application (see Example 2). 
The print service can be implemented as 
a Singleton, as there isn’t usually any need 
to have multiple print services in an ap- 
plication. The print service contains a field 
called target. When the program calls do- 
Print(), the print service prints the object 
referred to by target. The important thing 
is that when the print service is done print- 
ing, the target reference is not set to null. 
The object that was being printed can’t 
be garbage collected now, as there’s still 
a lingering reference to it from the print- 
er object. You have to make sure that tran- 
sient references are set to null once you’ve 
finished using them. 
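
A print service along these lines might look like the sketch below. The names target and doPrint() come from the text; the accessor methods and the way "printing" is simulated are assumptions made so the sketch is self-contained.

```java
// A sketch of the print service described above. The names target and
// doPrint() come from the text; everything else is assumed for illustration.
class PrintService {
    private static final PrintService INSTANCE = new PrintService();
    private Object target;               // transient reference held by a long-lived object
    private final StringBuilder output = new StringBuilder();

    private PrintService() {}
    static PrintService getInstance() { return INSTANCE; }

    void setTarget(Object target) { this.target = target; }
    boolean hasTarget() { return target != null; }
    String output() { return output.toString(); }

    void doPrint() {
        output.append(String.valueOf(target)).append('\n');   // stand-in for real printing
        target = null;   // clear the transient reference so the object can be collected
    }
}

class PrintServiceDemo {
    public static void main(String[] args) {
        PrintService printer = PrintService.getInstance();
        printer.setTarget("quarterly report");
        printer.doPrint();
        System.out.println(printer.hasTarget());   // prints "false"
    }
}
```

Leaving out the assignment to null in doPrint() gives the lingerer: the Singleton keeps the last printed object reachable until the next print request replaces it.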

One strategy for dealing with lingerers 
is to encapsulate state in a single object 
as opposed to having a number of objects 
maintaining state information. This makes 
changing state easier, as there’s only one 
reference to deal with. In general, linger- 
ers often occur when objects with multi- 
ple states hold on to references unneces- 
sarily when they’re in a quiescent or 
inactive state, so you have to carefully 
consider the state-based behavior of your 
objects. Another strategy is to avoid early
exits in methods: you should set up
methods so that they do their setup first,
then the processing, and finally the
clean up. If you exit before the method
has a chance to clean up, references may 
be left holding on to objects that are no 
longer needed. 
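
Where an early exit is hard to avoid, a try/finally block is one common Java idiom for making sure the cleanup still runs. The sketch below is offered under that assumption; it is not taken from the listings, and Indexer is a hypothetical class.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical long-lived object that borrows a reference while it works.
class Indexer {
    private List<String> current;        // transient: only needed during index()
    private int indexed = 0;

    boolean isHolding() { return current != null; }
    int indexedCount() { return indexed; }

    void index(List<String> lines) {
        current = lines;                 // setup
        try {
            if (current.isEmpty()) {
                return;                  // early exit: the finally block still runs
            }
            for (String line : current) {    // processing
                if (line.length() > 0) indexed++;
            }
        } finally {
            current = null;              // cleanup happens on every path out
        }
    }
}

class IndexerDemo {
    public static void main(String[] args) {
        Indexer indexer = new Indexer();
        indexer.index(new ArrayList<String>());
        System.out.println(indexer.isHolding());   // prints "false"
    }
}
```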

Laggards. The third type of loiterer is 
a laggard— someone (or something) who 
is always behind, never quite keeping up. 
In terms of loiterers, a laggard occurs 
when an object changes its state, but still 
has references to some data from its old 
state. Laggards are typically functional er- 
rors in addition to memory problems, but 
they’re often hard to find and may mani- 
fest themselves as memory problems be- 
fore they’re discovered as bugs. One way 
that laggards occur is when you change 
the lifecycle of a class; for example, when 
you change a class from having multiple 
instances to a Singleton, perhaps because 
it’s too expensive to keep creating new 
objects of this class. Now the single ob- 
ject of this class changes its state over time, 
as opposed to before when new instances 
were created whenever a new state was 
required. Again, comparing the situation 
to C++, this problem would probably man- 
ifest itself as a dangling pointer in C++, 
where the objects from the old state would 
have been manually removed, leaving a 
bad pointer. 

An example of this might be an object
that maintains information about files in
a directory, including statistics, and that
has references to the largest, smallest, and
“most complex” file (for some definition
of “complex”). When you change directories,
for some reason only the references
to the largest and smallest files are updated;
the reference to the most complex
file is a laggard, as it still points to
the file in the previous directory. This is,
of course, a bug, but it’s subtle and may
be difficult to detect. Using a memory
debugging tool, however, where you can see
all the instances of each class, you should
be able to see that there are more references
to file objects than there are files in
the directory because of the extra file being
held on to by the laggard reference.
Approaching this problem as a loiterer as
opposed to a bug may make it show up
much more quickly.

Example 3: A method is supposed to read through a file, parse items out of it, and
deal with certain elements in it.

You can deal with laggards by thinking 
carefully about your caching strategies: Is 
caching really necessary or is it accept- 
able to calculate certain values dynami- 
cally? It’s useful to use a profiler to de- 
termine when and where caching is 
appropriate. Another technique is to en- 
capsulate state transitions in a single 
method, so you don’t have code scattered 
in multiple locations responsible for chang- 
ing the state of an object. Keeping relat- 
ed code in a single locality makes it eas- 
ier to maintain. 
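
For the directory example, a single state-transition method might look like the following sketch. FileInfo and DirectoryStats are hypothetical names, and "complexity" is just a number here; the idea is that every cached reference is reset in one place, so none can be left pointing at the previous directory.

```java
import java.util.List;

// Hypothetical stand-in for a file entry; only the fields the sketch needs.
class FileInfo {
    final String name;
    final long size;
    final int complexity;     // whatever metric the application defines
    FileInfo(String name, long size, int complexity) {
        this.name = name;
        this.size = size;
        this.complexity = complexity;
    }
}

class DirectoryStats {
    private FileInfo largest;
    private FileInfo smallest;
    private FileInfo mostComplex;

    FileInfo largest() { return largest; }
    FileInfo smallest() { return smallest; }
    FileInfo mostComplex() { return mostComplex; }

    // The only place where the cached references change. Resetting all of
    // them first means none can keep pointing into the previous directory.
    void changeDirectory(List<FileInfo> files) {
        largest = null;
        smallest = null;
        mostComplex = null;
        for (FileInfo f : files) {
            if (largest == null || f.size > largest.size) largest = f;
            if (smallest == null || f.size < smallest.size) smallest = f;
            if (mostComplex == null || f.complexity > mostComplex.complexity) mostComplex = f;
        }
    }
}
```

Because the three fields are cleared together at the top of changeDirectory(), forgetting to recompute one of them leaves a null that is easy to notice rather than a stale reference that quietly loiters.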

Limbo. The fourth and final type of loi- 
terer is a limbo. Things in limbo are caught 
in between two places, while occupying 
neither of them fully. Objects in limbo may 
not be long-term loiterers, but they can 
take up a lot of memory at times when you 
don’t want them to. Limbos occur when 
an object being referenced from the stack 
is pinned in memory by a long running 
thread. The problem is that the garbage 
collector can’t do what's referred to as “live- 
ness analysis” where it would be able to 
find out that an object won't be used any- 
where in the rest of a method, thus mak- 
ing it eligible for garbage collection. 

In Example 3, the method is supposed 
to read through a file, parse items out of 
it, and deal with certain elements in it. 
This might happen if you were looking 
for a specific piece of data in an XML file, 
for instance. So the first thing the method 
does is call readIt(), which might do 
something like read in the whole file, 
which would consume a lot of memory. 
Then the method findIt() goes through
and searches for the particular information
you’re looking for, condensing all the
information from the big object into something
much smaller. From this point on
you don’t need big any more and you’d
probably like to reuse the memory it’s occupying.
But when you call parseIt(),
which may take a long time, the memory 
for big can’t be reused because there’s still 
a reference to it from the stack in 
method()’s stack frame — big can’t be 
garbage collected until method() returns. 
You need to help the garbage collector 
out by setting the reference to big to null, 
as shown in Example 3 in the line that’s 
commented out. 
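
The shape of such a method can be sketched as follows. The names readIt(), findIt(), parseIt(), method(), and big come from the text; the method bodies are placeholders invented so the sketch compiles and runs, and here the assignment to null is left active rather than commented out.

```java
// A sketch of the kind of method the text describes. The method names come
// from the text; their bodies here are placeholders for illustration.
class LimboDemo {
    static byte[] readIt() {
        return new byte[8 * 1024 * 1024];      // stands in for reading a whole file
    }

    static int findIt(byte[] big) {
        return big.length;                      // condense the big object into something small
    }

    static long parseIt(int item) {
        long sum = 0;
        for (int i = 0; i < item; i++) sum += i % 7;   // stands in for long-running work
        return sum;
    }

    static long method() {
        byte[] big = readIt();
        int item = findIt(big);
        big = null;              // drop the stack reference before the long-running call
        return parseIt(item);
    }

    public static void main(String[] args) {
        System.out.println(method());
    }
}
```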

One way to deal with limbos is to be
aware of long-running methods and watch 
where large allocations are occurring, to 
make sure that you’re not creating large 
objects that are being held on the heap 
by a reference on the stack. Again, tools 
such as profilers and memory debuggers 
can help determine what methods take a 
long time to run and what objects are very 
large. Explicitly adding statements to set 
references to null in cases where large ob- 
jects are being needlessly held can make 
a big difference. While it’s not practical or 
necessary to null out every reference af- 
ter you're done with it, it helps where ap- 
propriate. A blocked thread can also be a 
problem; for example, when a thread is 
blocked waiting on I/O, no object referenced
from the stack in that thread can
be garbage collected. 


Tools and Techniques 
There are a number of tools available to 
help you track down loiterers. One sim- 
ple thing to do is to track the objects 
you're creating manually so that you can 
programmatically monitor memory usage. 
The problem with this is, of course, that 
you have to modify your code in order to 
see what’s going on. An example of how 
to do this is demonstrated by ObjectTracker.java
(Listing One), a class that
lets you register objects to track them to 
see if you have the expected number of 
instances. Listing Two is an example of 
using ObjectTracker. To activate Object- 
Tracker, you have to define the Object- 
Tracker system property by adding the 
command-line flag “-DObjectTracker” 
when you run the Java VM. 

One important thing to note is that successful
use of ObjectTracker relies on the
Java VM assigning all objects unique hashcodes.
Unfortunately, due to differences
in implementation, this is only guaranteed 
to be true in Sun’s JDK 1.1.x JVMs and 
not in JDK 1.2 or higher. ObjectTracker 
will appear to work in JDK 1.2.x (or the 
1.3 beta), but may not accurately track 
large numbers of objects. 
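
One way to sidestep the hash-code assumption is to hold each tracked object through a weak reference instead of keying a table on its hash code. The sketch below is an assumption about how that could be done, not how Listing One works: WeakTracker is a hypothetical class, and it relies only on java.lang.ref.WeakReference, which is available from Java 2 onward.

```java
import java.lang.ref.WeakReference;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.Iterator;
import java.util.List;
import java.util.Map;

// Hypothetical alternative to hash-code keys; this is not how Listing One works.
class WeakTracker {
    private final Map<String, List<WeakReference<Object>>> byClass =
        new HashMap<String, List<WeakReference<Object>>>();

    // Remember obj without keeping it alive: a weak reference does not
    // stop the garbage collector from reclaiming the object.
    synchronized void add(Object obj) {
        String clazz = obj.getClass().getName();
        List<WeakReference<Object>> refs = byClass.get(clazz);
        if (refs == null) {
            refs = new ArrayList<WeakReference<Object>>();
            byClass.put(clazz, refs);
        }
        refs.add(new WeakReference<Object>(obj));
    }

    // Number of tracked instances of the class that have not been collected yet.
    synchronized int liveCount(String clazz) {
        List<WeakReference<Object>> refs = byClass.get(clazz);
        if (refs == null) return 0;
        int live = 0;
        for (Iterator<WeakReference<Object>> it = refs.iterator(); it.hasNext();) {
            if (it.next().get() == null) it.remove();   // referent already collected
            else live++;
        }
        return live;
    }
}
```

With this approach the tracked classes no longer need a finalizer that reports back to the tracker, since a cleared weak reference already signals that the object has gone away. Like ObjectTracker, it still cannot say which references are keeping a loiterer alive.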

A more industrial-strength solution is to 
use a full-blown profiler and/or memory 
debugger. There are a number of com- 
mercial products available, including 
JProbe (http://www.klgroup.com/jprobe/) 
from KL Group (where we work). These 
types of commercial products can track 
all the objects in your program, let you 
browse the heap and, very importantly, 
see not only the objects but also the ref- 
erences between them— this is important 
because in Java you have to worry about 
managing the references (the edges of the 
graph formed by objects on the heap), 
and not the nodes. 

Along the same line, there are some free- 
ware tools available that make use of the 
profiling output available from JDK 1.2.x 
JVMs. The -Xrunhprof option (explained by 
running java -classic -Xrunhprof:help) can 
generate both time and memory usage information.
This data can, in turn, be interpreted
by tools such as HyperProf
(http://www.physics.orst.edu/~bulatov/HyperProf/index.html),
although this seems to have
been removed by the author for software
patent reasons. The data format produced
by -Xrunhprof is documented in the output
file and on Sun’s web site
(http://developer.java.sun.com/developer/onlineTraining/Programming/JDCBook/perf3.html).

One final possibility is to use the Java
Virtual Machine Profiling Interface directly
(documented at
http://java.sun.com/products/jdk/1.2/docs/guide/jvmpi/jvmpi.html).
It can be used to monitor a number
of different internal activities inside
the VM, such as object allocation and removal
(see “What Is the Java VM Profiler
Interface,” by Andy Wilson, DDJ, September
1999). The major drawback with using
JVMPI is that it’s a native interface and
you’ll have to create your own C-based
shared object library or DLL to get the
information. While this is definitely the most
flexible approach, it’s no small amount of
work and it’s probably cheaper in the long
run to buy (or better yet, get your boss to
buy) a commercial memory-debugging
tool. A free tool that uses JVMPI is JUM
(http://www.iro.umontreal.ca/~lelouarn/jum.html).


Conclusion 

While Java’s garbage collection mecha- 
nism removes much of the difficulty of 
managing dynamic memory, problems can 
still occur. Most nontrivial Java programs 
will have some loitering objects present 
in them. Loitering objects are generally 
fewer than memory leaks in C++, but 
when they do occur they generally cause 
much larger problems. Removing loiter- 
ing objects can be difficult because Java 
handles memory in a fundamentally dif- 
ferent way than C++, which is what most 
developers are familiar with. You have to 
think about managing the edges, not the 
nodes on the heap. A thorough under- 
standing of object lifecycles, including life- 
times and state values, is key and must be 
used to build good memory-management 
practices into development practices. 

Dr. Dobb’s Journal, February 2000

Listing One

/*****************************************************************************
 * Copyright (c) 1999, KL GROUP INC. All Rights Reserved.
 * http://www.klgroup.com
 *
 * The Software is provided "AS IS," without a warranty of any kind.
 * ALL EXPRESS OR IMPLIED CONDITIONS, REPRESENTATIONS AND WARRANTIES,
 * INCLUDING ANY IMPLIED WARRANTY OF MERCHANTABILITY, FITNESS FOR A
 * PARTICULAR PURPOSE OR NON-INFRINGEMENT, ARE HEREBY EXCLUDED.
 * KL GROUP AND ITS LICENSORS SHALL NOT BE LIABLE FOR ANY DAMAGES
 * SUFFERED BY LICENSEE AS A RESULT OF USING, MODIFYING OR DISTRIBUTING
 * THE SOFTWARE OR ITS DERIVATIVES. IN NO EVENT WILL KL GROUP OR ITS
 * LICENSORS BE LIABLE FOR ANY LOST REVENUE, PROFIT OR DATA, OR FOR DIRECT,
 * INDIRECT, SPECIAL, CONSEQUENTIAL, INCIDENTAL OR PUNITIVE DAMAGES,
 * HOWEVER CAUSED AND REGARDLESS OF THE THEORY OF LIABILITY, ARISING OUT
 * OF THE USE OF OR INABILITY TO USE SOFTWARE, EVEN IF KL GROUP HAS
 * BEEN ADVISED OF THE POSSIBILITY OF SUCH DAMAGES.
 *****************************************************************************/

import java.lang.reflect.*;
import java.util.*;

/**
 * Utility class for identifying loitering objects. Objects are tracked by
 * calling ObjectTracker.add() when instantiated, and calling
 * ObjectTracker.remove() when finalized. Only classes that implement
 * ObjectTracker.Tracked can be tracked. As instances are created and
 * destroyed, they are reported to the stdout. Summaries by class can also be
 * reported on demand. To enable this functionality, add -DObjectTracker
 * when running your program. This will track all classes that implement
 * ObjectTracker.Tracked and call add/remove as indicated in the
 * previous paragraph.
 *
 * For a finer degree of control, specify a list of filters
 * when setting the <code>ObjectTracker</code> property. For instance,
 * -DObjectTracker=+MySpecialClass,-ClassFoo will only report
 * on instances of classes whose name contains MySpecialClass
 * but not ClassFoo. Hence MySpecialClassBar will be tracked, while
 * MySpecialClassFoo will not be. See <A HREF="ObjectTracker.html#start()">
 * start()</A> for more details.
 *
 * Limitations
 *
 * Since you must add instrumentation to all the classes you want to track,
 * this is not nearly as useful as a Memory Profiler/Debugger like
 * JProbe Profiler. Also, since it cannot tell you which references
 * are causing the object to loiter, it doesn't help you remove the loiterers.
 * If you want to solve the problem, you really need to use a Memory
 * Profiler/Debugger like JProbe Profiler. The only thing ObjectTracker can
 * help with is testing whether an instance of a known class goes away.
 *
 * Implementation Notes
 *
 * The current implementation assumes that every object has a unique
 * hashcode. A false assumption in general, but does work in JavaSoft's Win32
 * VM for JDK1.1. This implementation will definitely not work in JavaSoft's
 * implementation of the Java 2 VM, including the HotSpot VM.
 */
public class ObjectTracker {
    // Property ObjectTracker turns this on when set
    private final static boolean ENABLED =
        System.getProperty("ObjectTracker") != null;
    // Classes are hashed by name into this table.
    private static Hashtable classReg;
    private static Vector patterns;
    /** Record info about an object. Class and ordinal number are stored. */
    private static class ObjectEntry {
        int ordinal;    // distinguishes between mult. instances
        String clazz;   // classname
        String name;    // name (may be null)

        public ObjectEntry(int ordinal, String clazz, String name) {
            this.ordinal = ordinal;
            this.clazz = clazz;
            this.name = name;
        }
        public String toString() {
            return clazz + ":#" + ordinal + " (" + name + ")";
        }
    } // ObjectEntry
    /** Records info about a class. Within each class, a table of objects is
     * maintained, along with next ordinal to use to stamp next object
     * of this class. */
    private static class ClassEntry {
        String clazz;       // class name
        Hashtable objects;  // list of ObjectEntry
        int ordinal;        // last instance of this class created

        public ClassEntry(String clazz) {
            this.clazz = clazz;
            objects = new Hashtable();
            ordinal = 1;
        }
        public String toString() {
            return clazz;
        }
        /** Get the name of the object by invoking getName().
         * Uses reflection to find the method. */
        private String getName(Object o) {
            String name = null;
            try {
                Class cl = o.getClass();
                Method m = cl.getMethod("getName", null);
                name = (m.invoke(o, null)).toString();
            }
            catch (Exception e) { }
            return name;
        }
        public void addObject(Object obj) {
            // Store this object in the object table
            Integer id = new Integer(System.identityHashCode(obj));
            ObjectEntry entry = new ObjectEntry(ordinal, clazz, getName(obj));
            objects.put(id, entry);
            ordinal++;
            System.out.println("  added: " + entry);
        }
        public void removeObject(Object obj) {
            // Removes this object from the object table
            Integer id = new Integer(System.identityHashCode(obj));
            ObjectEntry entry = (ObjectEntry) objects.get(id);
            objects.remove(id);
            System.out.println("  removed: " + entry);
        }
        /** Dump out a list of all object in this table */
        public void listObjects() {
            if (objects.size() == 0) {
                // skip empty tables
                return;
            }
            System.out.println("For class: " + clazz);
            Enumeration objs = objects.elements();
            while (objs.hasMoreElements()) {
                ObjectEntry entry = (ObjectEntry) objs.nextElement();
                System.out.println("  " + entry);
            }
        }
    } // ClassEntry
    /** No constructor */
    private ObjectTracker() {}
    /** Determine if this class name should be tracked.
     * @return true if this class should be tracked. @see start */
    private static boolean isIncluded(String clazz) {
        int i = 0, size = patterns.size();
        if (size == 0) {
            // always match if list is empty
            return true;
        }
        boolean flag = false;
        for (; i < size; i++) {
            String pat = (String) patterns.elementAt(i);
            String op = pat.substring(0, 1);    // + or -
            String name = pat.substring(1);
            if (name.equals("all")) {
                if (op.equals("+"))
                    flag = true;    // match all, unless told otherwise
                else if (op.equals("-"))
                    flag = false;   // match nothing, unless told otherwise
            }
            else if (clazz.indexOf(name) != -1) {
                // match if any of the filter names is a substring of
                // the class name
                if (op.equals("+"))
                    return true;
                else if (op.equals("-"))
                    return false;
            }
        }
        return flag;
    }
    /** Must be called before any objects can be tracked. Turns on object tracking
     * if property <code>ObjectTracker</code> is set. In addition, the list of
     * patterns assigned to this property is stored for future pattern matching
     * by <code>isIncluded()</code>. This list of patterns must be supplied as a
     * comma-separated list, each preceded by <code>+</code> or <code>-</code>,
     * which indicates whether or not the pattern should cause matching classes to
     * be tracked or not. If property <code>ObjectTracker</code> has no values,
     * it is equivalent to <code>+all</code>. */
    public static void start() {
        if (ENABLED) {
            classReg = new Hashtable();
            patterns = new Vector();
            String targets = System.getProperty("ObjectTracker");
            StringTokenizer parser = new StringTokenizer(targets, ",");
            while (parser.hasMoreTokens()) {
                String token = parser.nextToken();
                patterns.addElement(token);
            }
        }
    }
    /** Add object to the tracked list. Will only be added if object's class has
     * not been filtered out. @param obj object to be added to tracking list */
    public static void add(Tracked obj) {
        if (ENABLED) {
            String clazz = obj.getClass().getName();
            if (isIncluded(clazz)) {
                ClassEntry entry = (ClassEntry) classReg.get(clazz);
                if (entry == null) {
                    // first one for this class
                    entry = new ClassEntry(clazz);
                    classReg.put(clazz, entry);
                }
                entry.addObject(obj);
            }
        }
    }
    /** Removes object from tracked list. This method should be called
     * from the finalizer. @param obj object to be removed from tracking list */
    public static void remove(Tracked obj) {
        if (ENABLED) {
            String clazz = obj.getClass().getName();
            if (isIncluded(clazz)) {
                ClassEntry entry = (ClassEntry) classReg.get(clazz);
                entry.removeObject(obj);
            }
        }
    }
    /** Print tracked objects, summarized by class. Also prints a
     * summary of free/total memory. */
    public static void dump() {
        if (ENABLED) {
            Enumeration e = classReg.elements();
            while (e.hasMoreElements()) {
                ClassEntry entry = (ClassEntry) e.nextElement();
                entry.listObjects();
            }
            System.out.println("==================================");
            System.out.println("Total Memory: " +
                Runtime.getRuntime().totalMemory());
            System.out.println("Free Memory: " +
                Runtime.getRuntime().freeMemory());
            System.out.println("==================================");
            System.out.println("");
        }
    }
    /** All classes that want to use this service must implement this
     * interface. This forces this class to implement Object's finalize
     * method, which should call <code>ObjectTracker.remove()</code>. */
    public interface Tracked {
        /** All classes that use ObjectTracker must implement a finalizer. */
        void finalize();
    }
} // ObjectTracker

Listing Two 
/***************************************************************************
 * Copyright (c) 1999, KL GROUP INC. All Rights Reserved.
 * http://www.klgroup.com
 * The Software is provided "AS IS," without a warranty of any kind.
 * ALL EXPRESS OR IMPLIED CONDITIONS, REPRESENTATIONS AND WARRANTIES,
 * INCLUDING ANY IMPLIED WARRANTY OF MERCHANTABILITY, FITNESS FOR A
 * PARTICULAR PURPOSE OR NON-INFRINGEMENT, ARE HEREBY EXCLUDED.
 * KL GROUP AND ITS LICENSORS SHALL NOT BE LIABLE FOR ANY DAMAGES
 * SUFFERED BY LICENSEE AS A RESULT OF USING, MODIFYING OR DISTRIBUTING
 * THE SOFTWARE OR ITS DERIVATIVES. IN NO EVENT WILL KL GROUP OR ITS
 * LICENSORS BE LIABLE FOR ANY LOST REVENUE, PROFIT OR DATA, OR FOR DIRECT,
 * INDIRECT, SPECIAL, CONSEQUENTIAL, INCIDENTAL OR PUNITIVE DAMAGES,
 * HOWEVER CAUSED AND REGARDLESS OF THE THEORY OF LIABILITY, ARISING OUT
 * OF THE USE OF OR INABILITY TO USE SOFTWARE, EVEN IF KL GROUP HAS
 * BEEN ADVISED OF THE POSSIBILITY OF SUCH DAMAGES.
 ***************************************************************************/


public class tester implements ObjectTracker.Tracked {
    private int[] junk = new int[5000];

    public static void main(String args[]) {
        ObjectTracker.start();
        for (int i = 0; i < 1000; i++) {
            tester t = new tester();
            t.doNothing();
            if (i % 100 == 0) {
                System.gc();
            }
        }
        ObjectTracker.dump();
    }

    public tester() {
        ObjectTracker.add(this);
    }

    public void finalize() {
        ObjectTracker.remove(this);
    }

    public void doNothing() {
    }
}


High-Speed Cryptography with the RSA Algorithm


Michael J. Wiener 


In 1978, Ronald Rivest, Adi Shamir, and Leonard Adleman published the RSA public-key cryptosystem (see “A Method for Obtaining Digital Signatures and Public Key Cryptosystems,” by R. Rivest, A. Shamir, and L. Adleman, Communications of the ACM, February 1978). Since then, public-key cryptography has become a critical technology for confidentiality and trust in messaging and online transactions on the Internet. There are many choices for symmetric cryptosystems, but only a few public-key algorithms are available. The two main contenders right now are RSA and elliptic-curve cryptography. Here, we focus on implementing RSA.

RSA can be implemented in hardware, 
but great performance can also be 
achieved in software running on general- 
purpose processors. In fact, many (but not 
all) hardware implementations of RSA are 
actually much slower than RSA on a PC. 
In this article, I’ll explain some of the key 
optimizations (with source code exam- 
ples) that can be made to make RSA as 
fast as possible. 

The speed of RSA varies greatly from 
one implementation to the next. It is not 
unusual to see RSA code that is 100 times 
slower than the best available for a given 
platform. There are many tricks that boost 
performance by a few percent, but the 
most important ones make the code up 
to four times faster. Combining these ideas 
can make a speed difference of one to 
two orders of magnitude. Here, I'll dis- 
cuss optimizations that apply to the un- 
derlying large integer math, the encryp- 
tion and decryption operations, and key 
generation. I'll also give RSA performance 
figures for what can be achieved on an 
Intel PC. 


Michael is a senior cryptologist with Entrust Technologies. He can be reached at wiener@entrust.com.


RSA Math Background

Creating an RSA key pair begins with selecting two large prime numbers p and q at random. A typical size for each of these primes is 512 bits. The product of these primes is called the RSA modulus n = p × q. To say that we are using 1024-bit RSA means that the RSA modulus is 1024 bits long.

Next, a public exponent e is chosen. A common choice is e = 2^16 + 1 = 65537. Then the private exponent d is computed such that ed-1 is divisible by both p-1 and q-1. This makes e and d inverses of each other, in a sense to be discussed shortly.

To encrypt a message (which is often a symmetric key used to encrypt the real message), it must first be encoded as an integer m between 0 and n-1. The ciphertext is computed with a modular exponentiation operation: c = m^e mod n. The special form of the aforementioned private exponent d makes it possible to recover the message from the ciphertext: m = c^d mod n.

The quantities e and n (but not p and q) can be made public, allowing anyone to encrypt a message, but only the owner of the private exponent d can decrypt the message. For attackers to find d, they must factor n into the primes p and q. For a 1024-bit modulus n, this is infeasible using the best factoring method known, the general number field sieve (see The Development of the Number Field Sieve, Lecture Notes in Math. 1554, edited by A. Lenstra and H. Lenstra, Jr., Springer-Verlag, 1993).

If the goal is to compute a digital signature on a message (or hash of a message) rather than to encrypt it, the owner of the private exponent computes s = m^d mod n. Then anyone can recover m and verify that the signature must have been created by the owner of d by computing m = s^e mod n.

For a more complete mathematical description of RSA, see Handbook of Applied Cryptography, by A. Menezes, P. van Oorschot, and S. Vanstone (CRC Press, 1997).
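To make the relationships among p, q, e, and d concrete, here is a toy-sized sketch. The key values (p=61, q=53, e=17, d=413) and all function names are illustrative assumptions, not part of the article's code, and the arithmetic is limited to moduli below 2^32 so that ordinary 64-bit integers are enough. A key this small offers no security whatsoever.

```cpp
#include <cassert>
#include <cstdint>

// Modular exponentiation by repeated squaring. Assumes n < 2^32 so that
// every intermediate product fits in 64 bits (toy sizes only).
std::uint64_t toy_modpow(std::uint64_t base, std::uint64_t exp, std::uint64_t n)
{
    assert(n > 0 && n < (1ULL << 32));
    std::uint64_t result = 1 % n;
    base %= n;
    while (exp > 0) {
        if (exp & 1) result = result * base % n;
        base = base * base % n;
        exp >>= 1;
    }
    return result;
}

// Hypothetical toy key, chosen only for illustration.
const std::uint64_t TOY_P = 61, TOY_Q = 53;
const std::uint64_t TOY_N = TOY_P * TOY_Q;  // 3233
const std::uint64_t TOY_E = 17;
const std::uint64_t TOY_D = 413;  // 17*413 - 1 = 7020 = 117*60 = 135*52

// c = m^e mod n
std::uint64_t toy_encrypt(std::uint64_t m) { return toy_modpow(m, TOY_E, TOY_N); }

// m = c^d mod n
std::uint64_t toy_decrypt(std::uint64_t c) { return toy_modpow(c, TOY_D, TOY_N); }
```

Signing uses the same two operations with the roles swapped: the signature is toy_decrypt(m), and anyone can check it with toy_encrypt.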


Targets of Optimization 

The main target of optimization is the modular exponentiation operation that consists of raising one number to some power modulo a third number: m^x mod n. A naive approach is to start with m, multiply by m a further x-1 times, divide by n, and take the remainder. This would be fine for very small numbers, but for RSA the private exponent d is very large, and doing d-1 multiplies is completely infeasible. A faster way can be illustrated using the exponent 19: m^19 = (((m^2)^2)^2 × m)^2 × m.

Instead of 18 multiplies, you need only four squares and two multiplies. The next problem is that m^x can be extremely large. This is dealt with by dividing by n and taking the remainder after every square and multiply instead of waiting until the end.
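The exponent-19 chain can be written out directly. The following is a minimal sketch under the assumption that the modulus is below 2^32, so 64-bit integers hold every intermediate product; the names OpCount, pow19_mod, and pow19_naive are invented for this illustration.

```cpp
#include <cassert>
#include <cstdint>

struct OpCount { int squares = 0; int multiplies = 0; };

// m^19 mod n computed as (((m^2)^2)^2 * m)^2 * m, reducing mod n after
// every square and every multiply. Assumes n < 2^32.
std::uint64_t pow19_mod(std::uint64_t m, std::uint64_t n, OpCount &ops)
{
    assert(n > 0 && n < (1ULL << 32));
    auto sq = [&](std::uint64_t v) { ++ops.squares; return v * v % n; };
    auto mul = [&](std::uint64_t a, std::uint64_t b) { ++ops.multiplies; return a * b % n; };
    m %= n;
    std::uint64_t r = sq(sq(sq(m)));  // m^8
    r = mul(r, m);                    // m^9
    r = sq(r);                        // m^18
    r = mul(r, m);                    // m^19
    return r;
}

// Naive reference: start with m and multiply by m eighteen more times.
std::uint64_t pow19_naive(std::uint64_t m, std::uint64_t n)
{
    assert(n > 0 && n < (1ULL << 32));
    m %= n;
    std::uint64_t r = m;
    for (int i = 0; i < 18; ++i) r = r * m % n;
    return r;
}
```

Both functions return the same value; the first reports four squares and two multiplies in its OpCount.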

Of the five further RSA optimizations 
described in the following, four apply to 
modular exponentiation and one is spe- 
cific to prime number generation. 


Large Integer Representation 
A frequent operation in modular expo- 
nentiation is multiplying two integers. How- 
ever, these integers are generally much larg- 
er than the maximum integer size 
supported by compilers. Some strategy is 
needed for breaking the problem down. 
To multiply large numbers, children are taught to break the problem down into many products of single-digit numbers and to add all the partial products. The same can be done with computers, except that you use bytes or words instead of digits; see Figure 1(a). Each large integer is represented as an array of words. The words are small enough that you can easily compute the product of two words. To multiply two large integers, consisting of three words each, would require 3×3 = 9 small products to be computed and summed appropriately to produce the final result consisting of six words (r0 to r5).
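As a rough sketch of this word-by-word scheme, the routine below multiplies two integers stored as arrays of 16-bit words, least significant word first. The type alias Words16 and the function name multiply_words are assumptions made for this example; a real BigInt class would differ in detail.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// A large integer as little-endian 16-bit words: value = sum of w[i] * 2^(16*i).
using Words16 = std::vector<std::uint16_t>;

// Schoolbook multiplication: a.size() * b.size() small products, each
// 16x16 -> 32 bits, summed into the result with carries.
Words16 multiply_words(const Words16 &a, const Words16 &b)
{
    Words16 r(a.size() + b.size(), 0);
    for (std::size_t i = 0; i < a.size(); ++i) {
        std::uint32_t carry = 0;
        for (std::size_t j = 0; j < b.size(); ++j) {
            // Small product plus the word already in place plus the carry.
            // The maximum is exactly 2^32 - 1, so 32 bits cannot overflow.
            std::uint32_t t = static_cast<std::uint32_t>(a[i]) * b[j]
                              + r[i + j] + carry;
            r[i + j] = static_cast<std::uint16_t>(t & 0xFFFFu);  // low word stays
            carry = t >> 16;                                     // high word moves on
        }
        r[i + b.size()] = static_cast<std::uint16_t>(carry);
    }
    return r;
}
```

Two three-word inputs produce a six-word result from nine small products, matching the count in the text.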

For C/C++ programmers, it is tempting 
to use 16-bit words so that the small prod- 
ucts are 32 bits long and fit in an unsigned 
long int. However, most 32-bit machines 
have an instruction that takes two 32-bit 
unsigned integers and computes the 64- 
bit result. The problem is that for many 
compilers, the upper half of the result is 
not accessible. To get the full 64-bit prod- 
uct requires writing a small amount of as- 
sembly code. Example 1 shows inline as- 
sembly for Microsoft’s Visual C++ on an 
Intel platform. 
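Where a compiler does provide a 64-bit integer type, the full product is reachable without assembly. The sketch below assumes std::uint64_t from <cstdint>, a later C++ facility than the compilers discussed here; it stands in for the idea behind Example 1 and is not a reproduction of it. The names WideProduct, multiply_32x32, and small_products are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Both halves of a 32x32-bit unsigned product.
struct WideProduct { std::uint32_t high; std::uint32_t low; };

// Widen one operand first so the multiplication is carried out in 64 bits.
WideProduct multiply_32x32(std::uint32_t a, std::uint32_t b)
{
    std::uint64_t full = static_cast<std::uint64_t>(a) * b;
    return { static_cast<std::uint32_t>(full >> 32),
             static_cast<std::uint32_t>(full & 0xFFFFFFFFu) };
}

// Number of small products in a schoolbook multiply of two integers of
// `bits` bits each, stored in words of `word_bits` bits.
unsigned long small_products(unsigned bits, unsigned word_bits)
{
    unsigned long words = (bits + word_bits - 1) / word_bits;
    return words * words;
}
```

For 1024-bit operands, small_products gives 4096 with 16-bit words and 1024 with 32-bit words, which is the factor of four mentioned in the text.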

Using assembly code here is well worth 
it. Going from a large integer representa- 
tion based on 16-bit words to 32-bit words 
cuts down the number of small products 
by a factor of four. This is possible on 
most 32-bit processors. Depending on the 
particular platform, this can save between 
a factor of two and four in the run time 
of all RSA operations. This is the only 
place in the RSA software where dramat- 
ic speed improvement is possible using 
assembly code. Intel’s Merced chip promises further improvement by offering an integer multiply and add instruction that takes two 64-bit unsigned integers and produces the full 128-bit product (see “IA-64 Application Developer's Architecture Guide,” May 1999, http://developer.intel.com/design/ia64/architecture.htm).


Faster Squaring 

The method used for modular exponen- 
tiation involves both multiplying and 
squaring large integers. The squaring can 
be done simply by using the multiply func- 
tion with the arguments equal, but there 
is a faster way. In Figure 1(b), you see that three of the small products appear twice (for instance, a×b = b×a). Instead of computing each one twice, you can compute them once and add them twice to the final result. This does not save much for the small example in the figure, but for the larger numbers used in RSA, this can make squaring almost twice as fast as multiplication.
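One way to realize this saving is sketched below: compute each cross product once, double the running sum, then add the diagonal squares. The code assumes 32-bit words with a 64-bit intermediate type, and the names Words, square_words, products_for_multiply, and products_for_square are made up for the illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

using Words = std::vector<std::uint32_t>;  // little-endian 32-bit words

Words square_words(const Words &a)
{
    const std::size_t n = a.size();
    Words r(2 * n, 0);

    // Pass 1: each cross product a[i]*a[j] with i < j, computed once.
    for (std::size_t i = 0; i < n; ++i) {
        std::uint64_t carry = 0;
        for (std::size_t j = i + 1; j < n; ++j) {
            std::uint64_t t = static_cast<std::uint64_t>(a[i]) * a[j] + r[i + j] + carry;
            r[i + j] = static_cast<std::uint32_t>(t & 0xFFFFFFFFu);
            carry = t >> 32;
        }
        r[i + n] = static_cast<std::uint32_t>(carry);
    }

    // Pass 2: double the sum of cross products (shift left by one bit).
    std::uint32_t shifted_out = 0;
    for (std::size_t k = 0; k < 2 * n; ++k) {
        std::uint32_t next = r[k] >> 31;
        r[k] = static_cast<std::uint32_t>(r[k] << 1) | shifted_out;
        shifted_out = next;
    }

    // Pass 3: add the diagonal squares a[i]*a[i] at word position 2*i.
    std::uint64_t carry = 0;
    for (std::size_t i = 0; i < n; ++i) {
        std::uint64_t sq = static_cast<std::uint64_t>(a[i]) * a[i];
        std::uint64_t lo = (sq & 0xFFFFFFFFu) + r[2 * i] + carry;
        r[2 * i] = static_cast<std::uint32_t>(lo & 0xFFFFFFFFu);
        std::uint64_t hi = (sq >> 32) + r[2 * i + 1] + (lo >> 32);
        r[2 * i + 1] = static_cast<std::uint32_t>(hi & 0xFFFFFFFFu);
        carry = hi >> 32;
    }
    assert(carry == 0);  // a square always fits in 2*n words
    return r;
}

// Small-product counts for n-word operands.
std::size_t products_for_multiply(std::size_t n) { return n * n; }
std::size_t products_for_square(std::size_t n) { return n * (n + 1) / 2; }
```

For 32-word (1024-bit) operands the counts are 1024 small products for a general multiply against 528 for a square.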


Divide and Conquer 

To decrypt or digitally sign, the owner of the private exponent computes r = y^d mod n, where y is either a ciphertext to be decrypted or a message (or hash of a message) to be signed, and r is the resulting plaintext message or digital signature. A clever observation is that because n = p × q, the Chinese Remainder Theorem says that you can compute r mod p and r mod q, and then combine them to get r mod n (see J.J. Quisquater and C. Couvreur’s “Fast Decipherment Algorithm for RSA Public-Key Cryptosystem,” Electronics Letters, October, 1982).

To take advantage of this performance improvement, a 1024-bit RSA private key is not really stored simply as two 1024-bit values (d, n) as described earlier, but is stored as five 512-bit values (p, q, d_p, d_q, k). Three of these values require explanation. When a new RSA key pair is created, these values are computed as follows: d_p = d mod (p-1), d_q = d mod (q-1), and k is computed such that kq-1 is divisible by p. This quantity k is used in the final stage of constructing r. The process of computing r = y^d mod n begins with computing r_p = y^(d_p) mod p and r_q = y^(d_q) mod q. These are then combined to form r as follows:


r = ((r_p + p - (r_q mod p)) × k mod p) × q + r_q


The added p before subtracting r_q is there simply to avoid having to deal with negative numbers. It is more efficient to deal with only unsigned quantities. Normally, the owner of the private key also stores the public exponent e and possibly n (or he may choose to compute n = p × q when necessary to save space).


Figure 1: (a) Multiplication; (b) squaring.


Example 1: Inline assembly for Microsoft's Visual C++.
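The recombination step can be tried on toy numbers. This sketch assumes primes small enough that 64-bit integers suffice, finds k by brute-force search (reasonable only at toy sizes), and uses invented names (CrtKey, make_crt_key, crt_exponentiate, modpow); the sample key p=61, q=53, d=413 is an illustrative assumption.

```cpp
#include <cassert>
#include <cstdint>

// base^exp mod n for n < 2^32 (toy sizes only).
std::uint64_t modpow(std::uint64_t base, std::uint64_t exp, std::uint64_t n)
{
    assert(n > 0 && n < (1ULL << 32));
    std::uint64_t result = 1 % n;
    base %= n;
    while (exp > 0) {
        if (exp & 1) result = result * base % n;
        base = base * base % n;
        exp >>= 1;
    }
    return result;
}

// The five stored values (p, q, d_p, d_q, k).
struct CrtKey { std::uint64_t p, q, dp, dq, k; };

CrtKey make_crt_key(std::uint64_t p, std::uint64_t q, std::uint64_t d)
{
    CrtKey key{p, q, d % (p - 1), d % (q - 1), 0};
    // Search for k such that k*q - 1 is divisible by p.
    for (std::uint64_t k = 1; k < p; ++k) {
        if (k * q % p == 1) { key.k = k; break; }
    }
    return key;
}

// r = y^d mod (p*q), computed from two half-size exponentiations.
std::uint64_t crt_exponentiate(std::uint64_t y, const CrtKey &key)
{
    std::uint64_t rp = modpow(y, key.dp, key.p);
    std::uint64_t rq = modpow(y, key.dq, key.q);
    // ((r_p + p - (r_q mod p)) * k mod p) * q + r_q, unsigned throughout.
    std::uint64_t h = (rp + key.p - rq % key.p) * key.k % key.p;
    return h * key.q + rq;
}
```

With the sample key, make_crt_key yields d_p=53, d_q=49, and k=38, and crt_exponentiate agrees with a direct exponentiation mod 3233 for every input below the modulus.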

It is natural to ask at this point whether all this is really worth the trouble. The short answer is yes. The time required to perform a modular exponentiation is proportional to the cube of the size of the numbers involved. This means that each of the half-size exponentiations mod p and mod q takes one-eighth as long as a big one mod n. The time required to do the final step of combining partial results is small by comparison. The overall improvement is almost a factor of four. The cost of computing the extra information for a private key (d_p, d_q, k) is negligible and is incurred only at the time of key generation.
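The arithmetic behind the factor of four can be written out as follows. The symbols T(b), for the time of a b-bit modular exponentiation, and C, for an unspecified constant, are introduced here only for illustration, and the small cost of the final combining step is ignored.

```latex
T(b) = C\,b^{3}, \qquad
T_{\mathrm{CRT}}(b) \approx 2\,T\!\left(\tfrac{b}{2}\right)
  = 2\,C\,\frac{b^{3}}{8}
  = \frac{T(b)}{4}
```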


Sliding Window Method 

I'll now examine a way to reduce the number of multiplies required to carry out a modular exponentiation. In the example earlier, we showed how to efficiently compute m^19 by taking m, squaring it three times (to make m^8), multiplying by m, squaring, and finally multiplying by m again to get m^19. Looking at the binary representation of 19 (10011) suggests a general approach that works for any exponent. First, find the most significant set bit in the exponent and set the current result to be m. Then, for each successively less significant bit of the exponent, square the current result and if the exponent bit was set, multiply the result by m. This is a fairly simple approach that requires a square for each exponent bit (after the first one) and a multiply for each set exponent bit (after the first one).
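A minimal sketch of this bit-by-bit scan follows, again assuming a modulus below 2^32 and a 64-bit exponent so that built-in integers are sufficient. The names OpCount and binary_modpow are assumptions for the example.

```cpp
#include <cassert>
#include <cstdint>

struct OpCount { int squares = 0; int multiplies = 0; };

// Left-to-right binary exponentiation: m^x mod n, for n < 2^32.
std::uint64_t binary_modpow(std::uint64_t m, std::uint64_t x, std::uint64_t n, OpCount &ops)
{
    assert(n > 0 && n < (1ULL << 32));
    if (x == 0) return 1 % n;
    m %= n;

    // Find the most significant set bit of the exponent.
    int top = 63;
    while (((x >> top) & 1) == 0) --top;

    std::uint64_t r = m;  // accounts for the top bit
    for (int bit = top - 1; bit >= 0; --bit) {
        r = r * r % n;    // one square per remaining exponent bit
        ++ops.squares;
        if ((x >> bit) & 1) {
            r = r * m % n;  // one multiply per remaining set bit
            ++ops.multiplies;
        }
    }
    return r;
}
```

For the exponent 19 (binary 10011) the counter ends at four squares and two multiplies, the same tally as the hand-built chain.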

The number of squares cannot be re- 
duced much, but the number of multiplies 
required by the approach just described 
can be reduced significantly for large ex- 
ponents using the sliding window method. 
This method deals with several exponent 
bits at a time instead of handling them 
one by one. You begin by selecting a win- 
dow size w. For RSA-1024, w=6 is most 
efficient. Modular exponentiation proceeds 
in three stages described in the following 
steps: 


1. Table computation. Compute all the odd powers of m up to 2^w - 1 using a square and 2^(w-1) - 1 multiplies. For w=6, this means computing m^1, m^3, ..., m^63 and storing them in a table.

2. Initialize exponent scanning. Next, we take the top window of w bits of the exponent and compute that power of m and assign it to the current result. If the w exponent bits form an odd value, then the appropriate table entry can be assigned to the current result. If the w exponent bits form an even value, then we can form that power of m by squaring a table entry or multiplying two entries and assigning the product to the current result. In Example 2, the initial exponent bits are 110, meaning that we have to initialize the result to m^6 (by squaring m^3 from the table).

3. Main Loop. Repeat the following until the exponent is used up. Starting from the end of the last window processed, scan for the next set bit in the exponent and square the result for each zero bit encountered along the way. When a “1” bit is found, take the next window of w exponent bits (including the first set bit) if there are that many. Otherwise, just take as many exponent bits as are left. Suppose that this string of exponent bits ends in z zero bits and begins with y bits that form an odd value. Then square the result y times, multiply by the table entry corresponding to the y bits, and finally square the result z times. This trick eliminates the need to store even powers in the table. In Example 2, after setting the result to m^6, you have two zero bits each requiring a squaring, and then a window of 110 that ends in z=1 zero bits and whose initial y=2 bits form the odd value 3. This means that you square twice, multiply by m^3, and then square once more.


Listing One implements the sliding win- 
dow method. For 1024-bit RSA, this 
method does not significantly affect the 
number of squaring operations required, 
but it reduces the number of multiplies 
by more than a factor of two. 
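For experimenting with window sizes on small numbers, the three stages can be sketched with built-in integers. This is a simplified illustration, not the listing: it assumes a modulus below 2^32 and a 64-bit exponent, and it always initializes an even top window by loading the odd part from the table and squaring for each trailing zero. The names Counts and sliding_window_modpow are invented here.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

struct Counts { int squares = 0; int multiplies = 0; };

// m^x mod n with window size w, for n < 2^32 and 1 <= w <= 8.
std::uint64_t sliding_window_modpow(std::uint64_t m, std::uint64_t x,
                                    std::uint64_t n, int w, Counts &ops)
{
    assert(n > 0 && n < (1ULL << 32));
    assert(w >= 1 && w <= 8);
    if (x == 0) return 1 % n;
    m %= n;

    // Stage 1: table[i] = m^(2*i + 1) mod n, built from one square.
    std::vector<std::uint64_t> table(static_cast<std::size_t>(1) << (w - 1));
    table[0] = m;
    std::uint64_t m2 = m * m % n;
    ++ops.squares;
    for (std::size_t i = 1; i < table.size(); ++i) {
        table[i] = table[i - 1] * m2 % n;
        ++ops.multiplies;
    }

    int pos = 63;  // index of the most significant set bit
    while (((x >> pos) & 1) == 0) --pos;

    bool started = false;
    std::uint64_t r = 1 % n;
    while (pos >= 0) {
        if (((x >> pos) & 1) == 0) {  // zero bit between windows
            r = r * r % n;
            ++ops.squares;
            --pos;
            continue;
        }
        // Take a window of up to w bits starting at this set bit.
        int len = std::min(w, pos + 1);
        std::uint64_t bits = (x >> (pos - len + 1)) & ((1ULL << len) - 1);
        int z = 0;  // trailing zero bits of the window
        while ((bits & 1) == 0) { bits >>= 1; ++z; }
        int y = len - z;  // leading bits, now an odd value in `bits`
        if (!started) {
            r = table[bits >> 1];  // Stage 2: load the top window
            started = true;
        } else {
            for (int i = 0; i < y; ++i) { r = r * r % n; ++ops.squares; }
            r = r * table[bits >> 1] % n;
            ++ops.multiplies;
        }
        for (int i = 0; i < z; ++i) { r = r * r % n; ++ops.squares; }
        pos -= len;
    }
    return r;
}
```

The Counts argument makes it easy to compare multiply counts for different values of w on the same exponent.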


Sieving 

A common way to generate the prime numbers needed for an RSA key pair is to generate random numbers of the desired size and perform a primality test on each one until a prime number is found. The most efficient primality tests are probabilistic; they can say that a number might be prime with some probability or that it is definitely not prime. If a given number passes the test enough times, then the probability of it not being prime is so low that we can declare it a highly probable prime and use it for RSA. A good probabilistic primality test is the Miller-Rabin test (again, see Handbook of Applied Cryptography, by A. Menezes, P. van Oorschot, and S. Vanstone, CRC Press, 1997).


Example 2: Small example using w=3.
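For readers who want to see the shape of a Miller-Rabin round, here is a sketch restricted to candidates below 2^32. Instead of random bases it uses the fixed bases 2, 7, and 61, which are known to give a correct answer for every number in that range; a test for RSA-sized candidates would need a BigInt type and randomly chosen bases. All function names are assumptions for the example.

```cpp
#include <cassert>
#include <cstdint>

// base^exp mod n for n < 2^32.
std::uint64_t modpow(std::uint64_t base, std::uint64_t exp, std::uint64_t n)
{
    std::uint64_t result = 1 % n;
    base %= n;
    while (exp > 0) {
        if (exp & 1) result = result * base % n;
        base = base * base % n;
        exp >>= 1;
    }
    return result;
}

// One Miller-Rabin round to base a, for odd n > 2.
// Returns false if a proves n composite, true if n passes this round.
bool miller_rabin_round(std::uint64_t n, std::uint64_t a)
{
    a %= n;
    if (a == 0) return true;  // base is a multiple of n: no information
    std::uint64_t d = n - 1;
    int s = 0;
    while ((d & 1) == 0) { d >>= 1; ++s; }  // n - 1 = d * 2^s with d odd
    std::uint64_t x = modpow(a, d, n);
    if (x == 1 || x == n - 1) return true;
    for (int r = 1; r < s; ++r) {
        x = x * x % n;
        if (x == n - 1) return true;
    }
    return false;
}

bool is_probable_prime(std::uint64_t n)
{
    assert(n < (1ULL << 32));
    if (n < 2) return false;
    if (n % 2 == 0) return n == 2;
    const std::uint64_t bases[] = {2, 7, 61};
    for (std::uint64_t a : bases) {
        if (!miller_rabin_round(n, a)) return false;
    }
    return true;
}
```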

A first thought on this approach to gen- 
erating prime numbers is that it makes no 
sense to test whether an even number is 
prime. If you generate only large odd can- 
didates to test, the process should run 
twice as fast. Taking this a step further, 
you could check whether the candidate is 
divisible by 3 or 5 before running the 
Miller-Rabin test. This can be extended to 
checking that the candidate is not divisi- 
ble by any prime below some bound. For 
a bound of 4096, you save a factor of 15 
in the number of primality tests performed. 
However, the cost of checking each can- 
didate for divisibility by the primes up to 
the bound can become expensive. Much 
of this cost can be eliminated using a tech- 
nique called “sieving,” where you seek the 
first prime number after a random starting 
candidate by efficiently eliminating multi- 
ples of small primes. 

An efficient approach based on sieving begins with generating a random odd candidate c. Create an array of Boolean flags, where element 0 corresponds to c, element 1 corresponds to c+2, and in general, element i corresponds to c+2i. Initially, set all flags to 1 indicating that the corresponding integer c+2i might be prime. For each small prime s less than the bound, find the smallest value i such that c+2i is divisible by s and turn off flags i, i+s, i+2s, and so on up to the end of the array. This eliminates all multiples of s as prime candidates. After this process is done for all small primes less than the bound, the flags that are still on correspond to possible primes that can be tested with the Miller-Rabin test. If none of the remaining candidates are prime, you can repeat with c+2l in place of c, where l is the length of the array. Listing Two implements sieving.
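The flag array is easy to try with an ordinary 64-bit candidate. The following sketch assumes the candidate is odd and larger than every small prime in use; the function names small_odd_primes and sieve_candidates are hypothetical, and the surviving flags would still have to go through a primality test.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Odd primes below `bound`, found with a simple sieve of Eratosthenes.
std::vector<std::uint32_t> small_odd_primes(std::uint32_t bound)
{
    std::vector<char> is_prime(bound, 1);
    std::vector<std::uint32_t> primes;
    for (std::uint32_t v = 3; v < bound; v += 2) {
        if (!is_prime[v]) continue;
        primes.push_back(v);
        for (std::uint64_t mult = static_cast<std::uint64_t>(v) * v; mult < bound; mult += 2 * v)
            is_prime[mult] = 0;
    }
    return primes;
}

// flags[i] == 1 means c + 2*i has no divisor among `primes`.
// Assumes c is odd and larger than every prime in the list.
std::vector<char> sieve_candidates(std::uint64_t c, std::size_t len,
                                   const std::vector<std::uint32_t> &primes)
{
    assert(c % 2 == 1);
    std::vector<char> flags(len, 1);
    for (std::uint32_t s : primes) {
        // Smallest t >= 0 with c + t divisible by s.
        std::uint64_t t = (s - c % s) % s;
        // The offset must be even (t = 2*i); s is odd, so adding it fixes parity.
        if (t & 1) t += s;
        // Turn off flags i, i+s, i+2s, ... to the end of the array.
        for (std::uint64_t i = t / 2; i < len; i += s) flags[i] = 0;
    }
    return flags;
}
```

Only the candidates whose flags remain set need to be handed to the Miller-Rabin test.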






Table 1: Performance based on Entrust’s implementation of RSA.


Performance

The performance figures in Table 1 are 
based on the RSA implementation by En- 
trust. Additional information is available 
at either http://developer.entrust.com/ 
or http://developer.entrust.ch/. The run 
times are based on a 400-MHz Pentium 
II, a public exponent of e=65537, and 
do not include overhead such as hash- 
ing a message. 

For the currently recommended RSA key size of 1024 bits, the operations of encrypting, decrypting, signing, and verifying signatures are very fast. Even key pair generation, which is not required frequently, takes less than a second. Digital certificates, which enforce the usable lifetime of a key pair, most frequently specify lifetimes of at least a year, and so a fraction of a second for key pair generation is a small overhead.
But what if you are forced to use larger 
key sizes in the future because attackers will 
have faster computers as Moore’s law march- 
es on? RSA-2048 is about a billion times 
stronger than the already strong RSA-1024, 
and even today’s RSA-2048 operations are 
quite fast. If attackers have faster comput- 
ers in the future, then so will users who are 
protecting their data. Moore’s law favors the 
cryptographer, not the cryptanalyst. 


Conclusion 
When RSA was first introduced, it ran very slowly on the computers that existed at the time. But the continuous drive to faster and faster computers has brought us to a point where RSA is very fast and practical to use. As computers become faster still, attackers will have more power for trying to break RSA, but this can be countered easily by increasing key sizes. Doubling the key size to 2048 bits costs the cryptographer less than a factor of eight in RSA signing time, but costs the cryptanalyst about a billion times more work to attack RSA. This makes it quite easy for the cryptographer to stay ahead.


Listing One


// To use this code, a BigInt class to handle big integers
// is needed along with the following 5 functions.

// set up an exponent to point to the top set bit for next_exponent_bit()
void init_exponent(BigInt exponent);

// get next exponent bit
unsigned long next_exponent_bit();

// return TRUE if exponent used up
int exponent_finished();

// result = result*result % modulus
void modular_square(BigInt &result, BigInt modulus);

// result = result*x % modulus
void modular_multiply(BigInt &result, BigInt x, BigInt modulus);

#define WINDOW 6
#define TABLE_LEN 32
BigInt table[TABLE_LEN];

// table[i] = message^(2*i+1) % modulus (the odd powers only)
void fill_table(BigInt message, BigInt modulus)
{
    BigInt sq;
    long i;
    sq = message;
    modular_square(sq, modulus);
    table[0] = message;
    for (i = 1; i < TABLE_LEN; i++)
    {
        table[i] = table[i - 1];
        modular_multiply(table[i], sq, modulus);
    }
}

BigInt modular_exponentiation(BigInt message, BigInt exponent, BigInt modulus)
{
    BigInt result;
    long started, num_pending_bits, num_zero_bits, i;
    unsigned long pending_bits, bit;

    fill_table(message, modulus);
    init_exponent(exponent);
    if (exponent_finished())
        return 1;
    started = 0;
    num_pending_bits = 0;
    pending_bits = 0;
    do
    {
        // square once for each zero bit between windows
        while (!exponent_finished() && ((bit = next_exponent_bit()) == 0))
            modular_square(result, modulus);
        num_pending_bits = pending_bits = bit;
        // collect up to WINDOW exponent bits, starting with the set bit
        while (!exponent_finished() && (num_pending_bits < WINDOW))
        {
            pending_bits = (pending_bits << 1) + next_exponent_bit();
            num_pending_bits++;
        }
        if (num_pending_bits > 0)
        {
            if (!started)
            {
                // top window: initialize result to message^pending_bits
                if (pending_bits & 1)
                    result = table[pending_bits >> 1];
                else if (pending_bits & 2)
                {
                    result = table[pending_bits >> 2];
                    modular_square(result, modulus);
                }
                else
                {
                    result = table[(pending_bits - 1) >> 1];
                    modular_multiply(result, message, modulus);
                }
                started = 1;
            }
            else
            {
                // strip trailing zero bits so that pending_bits is odd
                for (num_zero_bits = 0; !(pending_bits & 1); num_zero_bits++)
                    pending_bits >>= 1;
                for (i = num_zero_bits; i < num_pending_bits; i++)
                    modular_square(result, modulus);
                modular_multiply(result, table[pending_bits >> 1], modulus);
                for (i = 0; i < num_zero_bits; i++)
                    modular_square(result, modulus);
            }
        }
    }
    while (!exponent_finished());
    return result;
}

Listing Two 


// To use this code, a BigInt class to handle big integers
// is needed along with the following function.

// return true if candidate is a highly-probable prime
int miller_rabin_test(BigInt candidate);

#define SMALL_PRIME_BOUND_DIV2 2048
#define SQRT_BOUND 64
char small_prime_flags[SMALL_PRIME_BOUND_DIV2];

// sieve to find small primes
// small_prime_flags[i] == 1 means 2*i+1 is prime
void generate_small_primes()
{
    unsigned long i, sieve_val;
    small_prime_flags[0] = 0; // 1 is not prime
    for (i = 1; i < SMALL_PRIME_BOUND_DIV2; i++)
        small_prime_flags[i] = 1;
    // for each odd number, throw out its multiples
    for (sieve_val = 3; sieve_val <= SQRT_BOUND; sieve_val += 2)
        for (i = sieve_val + (sieve_val >> 1); i < SMALL_PRIME_BOUND_DIV2;
             i += sieve_val)
            small_prime_flags[i] = 0;
}

#define SIEVE_LEN 2048
BigInt generate_large_prime(BigInt start)
{
    unsigned long small_prime, i, sp;
    BigInt candidate; // must be a BigInt: start + 2*i does not fit in a long
    char sieve_array[SIEVE_LEN];
    generate_small_primes();

    start |= 1; // force starting point odd
    for (;; start += 2*SIEVE_LEN)
    {
        for (i = 0; i < SIEVE_LEN; i++)
            sieve_array[i] = 1;

        for (sp = 0; sp < SMALL_PRIME_BOUND_DIV2; sp++)
            if (small_prime_flags[sp])
            {
                small_prime = 2*sp + 1; // next prime to sieve with
                // magic to find i such that small_prime divides start+2*i
                i = (small_prime - 1) - ((start - 1) % small_prime);
                if (i & 1)
                    i += small_prime;
                i /= 2;
                // remove multiples of small_prime
                for (; i < SIEVE_LEN; i += small_prime)
                    sieve_array[i] = 0;
            }
        // test primality of remaining candidates
        for (i = 0; i < SIEVE_LEN; i++)
            if (sieve_array[i])
            {
                candidate = start + 2*i;
                if (miller_rabin_test(candidate))
                    return candidate;
            }
    }
}


DR. ECCO’S OMNIHEURIST CORNER


Sticks 


Dennis E. Shasha


A century earlier, he would also have been British. Only then, he would have been defending Britain’s imperial privileges. Now, he was on a
humanitarian mission — at least that’s how 
matters appear. 

General Nigel Collins stood straight in 
his spotless uniform and explained the 
problem he was having in the Balkan dis- 
trict under his command. 

“I don’t know who laid the mines or 
why,” he began. “Whichever side it was 
must have known that they were endan- 
gering their own people as much as any- 
one else. It’s a very nasty business, very 
nasty indeed. It’s the worst when you see 
the kids who...” 

He paused, shaking his head slightly. 
Regaining his professional air, he went on: 
“In any case, removing a certain kind of 
mine has proven to be a bit more difficult 
than we had anticipated. The problem 
mines are called “stick mines.” They are 
laid down in long straight lines, can sense 
movement anywhere along their length, 
and can blow up anything nearby. We have 
to clear them and we don’t know how, 
unless we know exactly where they are. 
They are difficult to detect because they 
go off only if they are stepped on and only 
with some small probability based on an 
internal clock. This means that they may 
lie dormant for years and then blow up 
when a group of youngsters are in the 
middle of a cricket match. And that’s not 
altogether unlikely, because the infested 
area is an enormous children’s park, mea- 
suring 3-kilometers long and 1-kilometer 
wide and there are 12 stick mines. We know certain distances between their endpoints, but we know neither their lengths nor (in most cases) their orientation. We do know that no two sticks touch. Because the park is more or less a rectangle, we describe the distances we know about using the directions up, down, left, right.

“We've labeled the segments as follows for purposes of discussion: AQ, PB, OC, KM, JN, RF, WX, LE, UD, IS, VT, and GH.

“We know the following (all distances 
are in eighths of kilometers): 


A is 4 to the left of B 
B is 9 to the left of R 
C is 2 to the left of X 
X is 3 to the left of F 
F is 1 to the left of E 
E is 3 to the left of D 
P is 3 to the left of O 
O is 6 to the left of M 
M is 1 to the left of N 
N is 4 to the left of L 
L is 2 to the left of U 
U is 5 to the left of S 
S is 2 to the left of T 
K is 9 to the left of J 
J is 5 to the left of I
I is 6 to the left of V
V is 3 to the left of G 
W is 9 to the left of H 


GH (G on top) is vertical and of length 1 
UD (U on top) is vertical and of length 3 
RF (R on top) is vertical and of length 1 


Q is 2 above P 
K is 2 above Q 
O is 2 above B 
M is 3 above X 
J is 4 above M 
I is 1 above W
W is 3 above L 


A is to the left and downward of P 
C is to the right and downward of A. 


“We also know that T, H, G, V, I, J, K, 
A, C, X, F, E, and D are on the edges of 
the park, though not necessarily in the 
corners. No stick mine extends beyond 
the border of the park. 

“We have learned all of this through our spies and through, well, the ‘persuasion’ of our allied security forces. Unfortunately, this does not quite solve the problem for us. In fact, we haven’t yet been able to construct a map of the stick mines. And that, Dr. Ecco, is why I’m here. Could you construct a map that is consistent with this evidence or tell me that the evidence is inconsistent? If it’s inconsistent, we’ll ask our friends to try a bit more...persuasion. I’ve seen one wounded child too many.”

“This might take a while,” Liane volun- 
teered. “General, could you give us a day?” 

“If you insist,” the general replied. 
“There are children in the neighborhood 
and it is all we can do to restrain them 
from playing in the park. The equipment 
is too tempting.” 


Reader: Is the data consistent or not? 
Draw a map that is consistent with as 
much data as you consider consistent. Is 
any of the data redundant? 


Ecco and Liane worked out a map, but 
they didn’t tell me whether there were any 
inconsistencies before I left for a confer- 
ence the next day. During the conference, 
the following question kept gnawing at 
me: Was there only one possible map, sev- 
eral possibilities, or an infinite number? If 
there were many, how should they be 
characterized? 


Reader: If you have any insight into this 
question, please let me know. A charac- 
terization may say, for example, that there 
are an infinite number of maps but they 
are offset only by translation. 


Last Month’s Solution 

Ecco and Liane suggest the following or- 
dering of the scenes in last month’s 
problem. 


Casta Patt 

Casta 

Hacket Murphy Casta Patt 

Patt 

Hacket 

Scolaro Patt Brown 

Patt Hacket Brown Murphy 

Hacket Thompson McDougal Murphy Brown 
Scolaro McDougal Hacket Thompson 
Thompson Murphy McDougal Patt 

Casta Mercer 

Casta McDougal Mercer Scolaro Thompson 
Casta McDougal Scolaro Patt 

Mercer Anderson Patt McDougal Spring 
Scolaro McDougal Casta Mercer 

Mercer Murphy 

Thompson McDougal Anderson Scolaro Spring 
McDougal Scolaro Mercer Brown 

Anderson Scolaro 


This gives a final price of $3,517,350.00.


Reader Notes
The extremely clever reader solutions to the “Calabaza” problems (DDJ, November 1999) that I'll discuss in a moment contrasted brilliantly with the bug in the solution I presented last month.

First, my bug: The third question asked about the situation in which one hole drained in n minutes, and two holes in m minutes where m is the integer below half of n. The question asked whether any n would be troublesome in that situation, where troublesome means that there would be no way to mark minutes. As pointed out first by Pearl Pauling and then later by the other readers mentioned below, no solution is possible if n and m are both even. That is not the case for n=20, for example, but it is true for n=22 and n=18. Please don’t tell Dr. Ecco.

Otherwise, several readers found solu- 
tions that were similar to Dr. Ecco’s: Greg 
Smith, Pearl Pauling, Robert H. Morrison, 
Yves Piguet, Steve Kietzman, Jonathan 
Parker, Richard W. Lipp, Rodney Meyer, 
Patrick R. Schonfeld, Rollin Crittendon, 
Chad Harrington, Alan E. Dragoo, David 
Stevenson, Marty Pinaud, and Burghard 
Hoffrichter. 

And then there were the unconventional solutions that involved changing the number of open holes in midstream, if you'll excuse the pun. Philip Straite (and later Kevin A. Shepherd) proposed the first one (imagining that the holes are filled with corks): “Consider 5 calabazas, A through E. Here is the basis for generating a calabaza-emptying event every minute between 15 and 19 minutes inclusive. Once you have these, it is a simple matter to refill and pull both corks from the calabazas and repeat each on a 5 minute interval.

“Calabaza A. T0 (time 0) pull both corks (i.e., pull both corks at time 0). T5 empties (i.e., calabaza A empties at time 5). T5 refill and pull both corks, repeat every 5 minutes.

“Calabaza B. T0 pull one cork. T11 empties. T11 refill and pull both corks, repeat every 5 minutes.

“Calabaza C. T0 pull one cork. T5 replace cork. T11 pull one cork. T17 empties. T17 refill and pull both corks, repeat every 5 minutes.

“Calabaza D. T11 pull both corks. T15 
replace both corks. T17 pull both corks. 
T18 empties. T18 refill and pull both corks, 
repeat every 5 minutes. 

“Calabaza E. T10 pull both corks. T11 
replace both corks. T15 pull both corks. 
T19 empties. T19 refill and pull both corks, 
repeat every 5 minutes.” 

Jimmy Hu, Scott J. Taylor, and Jeff 
Hafner proposed the following even 
quicker and more economical solution
(I’m quoting here from Jimmy): “Fill 2 
gourds. Open one hole in gourd A and 
two holes in gourd B. At 5 minutes, gourd 
B will drain, so refill it. At 10 minutes, 
gourd B will drain again so quickly plug 
the holes and put it under gourd A so that 
gourd A (which has 1 minute left of wa- 
ter) will be draining into gourd B. So now 
you can time every minute by letting the 
gourds drain into each other and switch- 
ing them every minute. That is to say, at 
11 minutes, when gourd A drains, plug its 
holes and put it under gourd B and then 
open 1 hole in gourd B.” 

As Alan Dragoo points out, this solu- 
tion requires that the two gourds have the 
same water flow per second with one hole 
open and the same water flow per second 
with two holes open. (That is stronger
than saying that they will drain at the same
rate. Do you see why?)

Ralph Fellow and Magne Oestlyngen 
showed how to match the time of this so- 
lution but without this stronger assump- 
tion. Their solution required 5 calabazas, 
however. Magne went on to give a 9 
minute solution using 5 calabazas and as- 
suming that each one flows at the same 
rate (as in Jimmy Hu’s assumption). 
MICROWAY DELIVERS SUPERCOMPUTER PERFORMANCE IN NT, LINUX AND UNIX WORKSTATIONS

Since 1982, Microway’s products and technical
support have helped users get more done for
less money. Starting with the concept that
PCs could use more numeric power, we built a
product line and customer base that is now
worldwide. The motherboards and workstations
we design today use Pentium- and Alpha-based
processors that deliver 20,000 times the
throughput of the 8087s we started with in 1982.

Microway has been building Linux Beowulf
clusters since 1997. Our users employ either
PVM or MPI to manage communications between
processors in clusters from 8 to 200 Pentium
or Alpha CPUs. We design systems using 21264
dual Alpha motherboards, UP2000 dual
motherboards, 21164-LX single Alpha
motherboards, or Pentium processors for all
price points.

Our expertise in parallel processing dates
back to the mid-80s when we were Inmos’s
largest customer for transputers. Following
this, Microway built small supercomputers that
featured up to 20 Intel i860 RISC processors.
The 667-MHz dual Alpha motherboard, which we
currently feature in our high-end
workstations, delivers 2.6 gigaflops of
throughput. If you have an application that is
a big-time number-cruncher or a DSP
application that needs 64 bits of precision,
you should consider our solutions.

[Photos: 667-MHz Alpha 21264 Daughtercard; Dual Alpha 21264DP Motherboard]

Microway is known for giving excellent
service. When you call us, you talk to a
competent person. Because we appreciate the
critical nature of your work, every one of our
products comes with free tech support for two
years. Our legendary tech support makes it
possible for us to configure your favorite
True 64 UNIX and OpenVMS systems, yet also
deliver NT and Linux. And we know how to take
care of special situations, including
rack-mounted industrial-grade systems,
RAID-controlled hard disk farms, and high
bandwidth interprocessor communications.

Microway’s current software product line is
anchored by NDP Fortran, which is available
for Pentiums and generates Alpha code for
Linux. Compaq and Intel’s ten-year agreement
ensures that the Alpha 21264 and 21364 will
continue to be performance leaders in the
high-speed numerics market for years to come.
Intel and Samsung will manufacture the Alpha,
which Compaq engineers will design and market.
This means that you can count on Microway to
continue our tradition of designing
state-of-the-art clusters, motherboards, and
workstations.

Microway hardware products have always been
popular with government, industry, and
university researchers. Our i860 powered cards
were used to search for oil, improve MRI
resolution, do air flow studies on jet
engines, and help the NASA SETI project search
for extraterrestrial life. Microway high-end
Alpha and Pentium workstations are currently
in use throughout the US in major universities
and research organizations like NASA, NIST,
NIH, Lincoln Laboratory, Smithsonian, and CDC.

Company History

Microway was founded in 1982 to help
scientists and engineers take advantage of the
IBM PC. Our first product was a library, which
made it possible to use an 8087 in a PC. We
bundled our libraries with 8087s and became
one of Intel’s largest customers.

Our hardware products included PC
accelerators, coprocessor cards, and
motherboards. In 1986, we introduced the first
32-bit Fortran to run on an Intel PC. The
first PC to hit a megaflop used a
Microway/Weitek coprocessor driven by NDP
Fortran. Over the years, NDP Fortran has been
used to port hundreds of popular mainframe
applications, including MATLAB and ASPEN, to
Intel-based PCs.

Microway’s workstations have been purchased by
university and NASA laboratories since 1989.
PC Computing Magazine named our Alpha system
“the fastest Windows NT workstation on the
planet ... the performance leader.”

For more information, contact Microway, Inc.,
Research Park, Box 79, Kingston, MA 02364;
Tel: 508-746-7341; Fax: 508-746-4678;
e-mail: info@microway.com; www.microway.com
All This, and Frequent Flyer Miles, Too!

Gregory V. Wilson

They say that the Internet has made
geography irrelevant, but I’m not 
convinced. I put 65,000 miles on my 
frequent-flyer card last year. 

On the bright side, all that flying has 
given me a lot of time to read, and vari- 
ous publishers have done a good job of 
giving me things worth reading. At the top 
of the list are two very practical, and very 
useful, books: Mastering Algorithms with 
Perl, by Jon Orwant, Jarkko Hietaniemi, 
and John Macdonald, and Programming 
for the Java Virtual Machine, by Joshua 
Engel. 

Mastering Algorithms with Perl is a good 
complement to O'Reilly & Associates’ earlier
Perl Cookbook, by Tom Christiansen
and Nathan Torkington (ISBN 1-56592-243-3;
reviewed in this column in
February 1999). Where the Cookbook
showed how to solve particular small and 
medium-sized tasks, Mastering Algorithms 
is a guided tour of classic data structures 
and algorithms, ranging from linked lists 
to directed graphs, and from sorting and 
searching to number theory and crypto- 
graphy. The book is rich in code— hard- 
ly a page goes by without some Perl or a 
diagram to illustrate a point— but the au- 
thors have kept these snippets short 
enough to ensure comprehensibility. They 
are also careful to write readable Perl (no, 
that isn’t an oxymoron), so that novice or 
occasional users won't have to flip back 
and forth between this book and a Perl 
language reference. 

Greg is the author of Practical Parallel
Programming (MIT Press, 1995), and coeditor
with Paul Lu of Parallel Programming
Using C++ (MIT Press, 1996). Greg can
be reached at gvwilson@interlog.com.

The excellent index will also help
keep back-and-forthing to a minimum:
In the month or so that I’ve had this
book on my desk, the index has always
taken me straight to what I was looking
for. My only complaints are the lack of
theoretical analysis after the first few
chapters (I don’t usually care, but when
I do, it’s handy to have), and the offhand
way the authors excuse Perl’s use
of reference counting instead of real
garbage collection. Java is proof that
modern garbage collection systems are
fast enough for production use. It’s high
time the authors of books such as this
one stopped telling people to avoid circular
reference patterns (which cause
memory leaks), and began admitting that
Perl, and other scripting languages, have
some ground to make up.
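The circular-reference leak is easy to demonstrate. The sketch below uses Python rather than Perl, purely as a stand-in: CPython pairs reference counting with a separate cycle collector, so switching that collector off approximates a runtime that relies on reference counts alone. The `Node` class and the function name are invented for this illustration, and the behavior shown is specific to CPython.

```python
import gc
import weakref


class Node:
    """A tiny object that can point at another Node."""

    def __init__(self, name):
        self.name = name
        self.other = None


def cycle_survives_refcounting():
    """Report whether a two-node cycle is freed by reference counts alone.

    Returns (alive_after_del, alive_after_collect).
    """
    gc.disable()  # approximate a pure reference-counting runtime
    try:
        a = Node("a")
        b = Node("b")
        a.other = b
        b.other = a               # a and b now keep each other alive
        probe = weakref.ref(a)    # lets us observe a without owning it
        del a, b                  # no outside references remain
        alive_after_del = probe() is not None
        gc.collect()              # run the cycle collector by hand
        alive_after_collect = probe() is not None
        return alive_after_del, alive_after_collect
    finally:
        gc.enable()
```

With the collector off, the two nodes outlive their last outside reference, which is the leak the authors warn about; an explicit `gc.collect()` then reclaims them.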

This brings us to Joshua Engel’s Pro- 
gramming for the Java Virtual Machine. 
This book is not a reference guide to the 
JVM, although it can certainly serve as 
one. Instead, it describes how Java, and 
other languages, can be compiled to run 
on the JVM. If you want to know how 
calls to virtual and static methods differ, 
how exceptions are implemented, or 
what adding the synchronized keyword 
to a method actually does, then this is 
the book you’ve been looking for. What’s 
more, this book also describes how fea- 
tures from other languages can be made 
to run on the JVM. The cores of both 
Scheme and Prolog are covered, as are 
Sather-style iterators, parameterized types 
(like the templates of C++), and full- 
blown multiple inheritance. 

Like the Perl algorithms book, PJVM is
well written, well edited, well illustrated, 
and has lots of pertinent code examples 
and a good index. If you are building 
compilers or tools that target the JVM, or 
just want a better understanding of what’s 
going on under the hood when your code 
starts to run, then this book is the place 
to start. 

Java’s teachability is one of its greatest 
strengths. However, that doesn’t make it 
the right language for every student. In par- 
ticular, it wasn’t designed for number 
crunching, and doesn’t have such basic nu- 
meric programming features as operator 
overloading or data-parallel array notation. 
Many science departments and engineer- 
ing schools are now teaching Java any- 
way, in part because its hypesters have 
convinced everyone that it is the future, and
in part because those hypesters are right. 

Stephen Chapman’s Java for Engineers 
and Scientists (JES) and Richard Davies’ 
Introductory Java for Scientists and Engineers
(IJSE) are aimed at different slices
of this growing market. JES is a fairly con- 
ventional introduction to programming. 
It covers data representation, loops and 
conditionals, classes, and all the other 
machinery that any beginners’ text has 
to, in more or less the same order as oth- 
er beginners’ texts. It emphasizes nu- 
merics more than most such books, and 
does base some of its 
exercises on simple 
physics problems, 
but not enough to 
make it clearly better 
for a first-year engi- 
neering program- 
ming course than 
other books. 

IJSE, on the other 
hand, would be very 
hard to follow if you 
didn’t already know 
something about pro- 
gramming, but would 
be a better choice for 
a Fortran, C, or MATLAB programmer who 
wanted to change languages. (It actually 
has a chapter called “Java for C Program- 
mers.”) Davies sometimes takes three sen- 
tences to make a point that could as clear- 
ly be made in one, and, like Chapman, 
doesn’t go out of his way to point out that 
current-generation Java systems are much
slower than their Fortran, C, or C++ equiv- 
alents. 

C++ is the topic of our next book, James 
Smith’s C++ Toolkit for Engineers and Scientists
(CTES). I ordered my copy of this
one based on the title, and while my first 
flip through the table of contents was pos- 
itive, my reaction to the book itself was 
more equivocal. The best thing about the 
book for me was its exposition of the de- 
sign of a medium-sized object-oriented 
framework for numerical computation, 
complete with I/O, error handling, and 
all the other incidental code that makes 
up half or more of any useful library. Few 
professional programmers, and even few- 
er scientists and engineers, have ever 
been shown how to do this, and I think 
many would profit from a careful read- 
ing of this book. 

However, I do have some reservations 
about the software in this book. C++ has 
a reputation for low performance, in part 
because compilers have not been able to 
optimize away the temporary variables 
created by overloaded operators. The tem- 
plate expression techniques developed by 
Todd Veldhuizen and others now enable 
Standard C++ to deliver the performance 
of optimized Fortran-77, without any sac- 
rifice in the level of abstraction. Unfortu- 
nately, Smith’s software doesn’t use these 
techniques. Like many others in this fast- 
moving industry, it has simply been over- 
taken by developments. Despite this, CTES 
might still be the best place for a scientist 
or engineer to learn how to build object- 
oriented libraries, at least until an equally 
lucid description of the MTL (http://www 
mpi.nd.edu/research/mtl/) or Blitz (http:// 
www.oonumerics.org/blitz/) comes along. 
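The temporaries problem, and the idea behind the fix, can be sketched briefly. What follows is a run-time analogue written in Python, not the C++ template expression technique itself (which does this work at compile time) and not code from Smith's book, MTL, or Blitz. The classes `Expr`, `BinOp`, and `Vec` are hypothetical. Arithmetic on vectors builds a small description of the expression, and the assignment then evaluates it element by element in one pass, so `b * c` never exists as a vector of its own.

```python
import operator


class Expr:
    """Base class: arithmetic builds a description of the work, not a result."""

    def __add__(self, other):
        return BinOp(operator.add, self, other)

    def __mul__(self, other):
        return BinOp(operator.mul, self, other)


class BinOp(Expr):
    """A pending elementwise operation on two sub-expressions."""

    def __init__(self, op, left, right):
        self.op, self.left, self.right = op, left, right

    def at(self, i):
        # Compute element i on demand from the operands' element i.
        return self.op(self.left.at(i), self.right.at(i))


class Vec(Expr):
    """A concrete vector holding its own data."""

    def __init__(self, data):
        self.data = list(data)

    def at(self, i):
        return self.data[i]

    def assign(self, expr):
        # One pass over the indices; no intermediate Vec is built for b * c.
        self.data = [expr.at(i) for i in range(len(self.data))]
        return self


a, b, c = Vec([1, 2, 3]), Vec([4, 5, 6]), Vec([7, 8, 9])
result = Vec([0, 0, 0]).assign(a + b * c)
```

A naive overloaded `*` would instead allocate a full vector for `b * c` before the addition ran; avoiding that allocation is where the performance gain comes from.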

The last pair of books on this month’s 
list examine two activities that are prob- 
ably equally foreign 
to true believers in 
UNIX: quantum computing
and using Visual Basic
to drive Microsoft
Word. (If I
had to wager, I'd bet 
that my UNIX friends 
would start using the 
former long before 
they'd start using the lat- 
ter...) Michael Brooks, 
the editor and coauthor 
of Quantum Comput- 
ing and Communica- 
tions, is a physicist 
and science journalist. The slim book he 
has produced is a nontechnical intro- 
duction to the idea of using quantum un- 
certainty to speed up certain classes of 
computation, and to ensure secure com- 
munication. There are very few equa- 
tions, and some occasional gushing, but 
overall this is a good pop-sci look at 
something that might be computing’s 
next really big idea. 

The subject of Steve Roman’s Learn- 
ing Word Programming might not be 
anybody’s next big thing, but that 
doesn’t diminish its utility. Microsoft’s 
use of a single scripting language is a 
big step forward from UNIX’s reliance 
on shell scripts, Perl, Emacs Lisp, and
dozens of opaque configuration files. 
Most people think of Microsoft Office 
as a package for accountants and small 
businesses. In the hands of a competent 
VB programmer, however, Word, Excel, 
and other tools are more like a library 
of software components. Roman does 
spend time talking about how to use VB 
to write Word macros, but he also de- 
scribes how other VB applications can 
use Word to check spelling, format text, 
and so on. The book jumps around be- 
tween topics more than I would have 
liked, and there are places where I felt 
that Roman stopped half-way through 
an explanation, but this book is still a 
worthy entry in O’Reilly’s growing list 
of Microsoft titles. 


DDJ 


http://www.ddj.com 








FairCom has released FairCom ODBC
Driver: c-tree 4.3 Edition, targeted to
c-tree-based applications developed world-wide
in the past 15 years. The FairCom ODBC
Driver is based on single-tier driver
technology, which means that all of the program
logic necessary to handle requests from a
front-end application is contained within the
driver itself, including a SQL interpreter.
The ODBC-compliant application uses the
FairCom driver to access c-tree data files
via the traditional c-tree ISAM API. The
FairCom ODBC Driver: c-tree 4.3 Edition
is priced separately for end users and for
FairCom developers who wish to deploy
directly to their customers. c-tree Plus sells
for $895.00.

FairCom Corp. 

2100 Forum Boulevard, Suite C 

Columbia, MO 65203-5456 

800-234-8180 

http://www.faircom.com/


Excelon Stylus, an Extensible Stylesheet 
Language (XSL) editing tool, has been re- 
leased by Object Design. Excelon Stylus 
enables XML to be transformed into HTML 
for presentation on the Web, and speeds 
the development of XSL stylesheets by in- 
tegrating XML data, stylesheet editing, and 
web page preview in a single visual en- 
vironment. eXcelon Stylus is available both 
as a standalone tool and as part of the tool 
set for ODI’s Excelon 2.0 e-business ap- 
plication development environment, which 
is a dynamically extensible platform for 
building and deploying XML-based eBusi- 
ness applications. Pricing for the Excelon 
Stylus starts at $199.00. 

Object Design 

25 Mall Road 

Burlington, MA 01803 

800-962-9620 

http://www.odi.com/ 


Indigo Rose has announced Setup Facto- 
ry 5.0, a visual development tool for soft- 
ware deployment and installation via 
diskette, CD-ROM, LAN, and the Internet. 
Setup Factory offers a solution for creating
flexible installation systems without
the need to learn a proprietary scripting 
language. There is built-in support for the 
most common development systems such 
as Visual Basic 5 and 6, ODBC, BDE, 
ADO, DAO, OLE, RDS, MSJET, Adobe Ac- 
robat, and QuickTime. The new Screen 
Manager and Dialog Gallery offer control 
over the look-and-feel of installation. Pre- 
made dialogs such as static and scrollable 
text, check boxes, radio buttons, edit box- 
es, and file browsers can be customized 
with graphics and rich text features such 
as multiple fonts, colors, styles, and sizes. 
Setup Factory is available for $295.00. 

Indigo Rose Corp. 

123 Bannatyne Avenue, Suite 230 

Winnipeg, MB 

Canada R3B 0R3

800-665-9668 

http://www.indigorose.com/


The Apache Software Foundation has an- 
nounced the formation of the xml.apache 
.org Project, an Open Source project for 
XML and XSL tools. The project’s purpose 
is to build a solid reference suite of ap- 
plications and libraries for managing XML. 
The xml.apache.org project includes col- 
laborators such as Bowstreet, DataChan- 
nel, Exoffice, IBM, Lotus Development, 
and Sun Microsystems— all of which are 
contributing technologies to the project. 
These technologies include XML4J and 
XML4C parsers from IBM; Sun’s Java Pro- 
ject X and XHTML parsers; Lotus XSL; 
Xpages from DataChannel; the FOP XSL 
formatter from James Tauber of Bowstreet; 
Cocoon from Stefano Mazzocchi and the 
Java-Apache community; and OpenXML 
and XSL:P, both from Exoffice. The pro- 
ject code and participation guidelines are 
available at the Foundation’s web site. 
The Apache Software Foundation 
1901 Munsey Drive 
Forest Hill, MD 21050-2747 
http://xml.apache.org/ 


Merant has announced PVCS Professional 
3.5 featuring advanced issue and change 
management capabilities. PVCS Professional 
includes a new conversion program to ease 
migration to Merant PVCS for e-business 
application development and web teams. 
PVCS Professional is a complete suite for 
organizing, managing, and automating ap- 
plication and web development, including 
PVCS Version Manager for version control, 
PVCS Tracker for issues and change man- 
agement, and PVCS Configuration Builder 
for build management. PVCS Professional 
3.5 delivers PVCS Metrics, which allows 
users to generate, schedule, and post pro- 
ject metrics to defined web pages; change 
history queries that create queries based 
on past events; and conditional notification,
which enables automatic e-mail or
In-Tray notification based on the specific 
field values of a record. PVCS Profession- 
al 3.5 is available for $1199.00. 

Merant 

701 E. Middlefield Road 

Mountain View, CA 94043 

650-938-3700 

http://www.merant.com/ 


Sun Microsystems has unveiled Java Blend
2.0 and Java Message Queue 1.0, two Java
tools that provide services to help reduce
the cost and time required to create Internet
applications. Java Blend 2.0 software
is a data access tool that connects dis- 
tributed applications with multiple sources 
of data. Java Blend includes transparent 
persistence capabilities that allow devel- 
opers access to tables, attributes, and rows, 
without writing any JDBC or SQL code. It 
maps relational database tables into Java 
objects and eliminates impedance mis- 
match, the gap that occurs when mapping 
2D data with Java objects. Java Message 
Queue 1.0 enterprise messaging software 
provides a standard way for business ap- 
plications to communicate and exchange 
information. Its software allows develop- 
ers to focus on creating business logic, and 
is a full production implementation of the 
open-standard Java Message Service 1.0.1 
specification, which provides improvements 
over traditional process-oriented messag- 
ing systems. Java Message Queue software 
intelligently routes messages throughout 
the network for efficient bandwidth usage. 

Sun Microsystems Inc. 

901 San Antonio Road 

Palo Alto, CA 94303 

800-555-9786 

http://www.sun.com/ 
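
The table-to-object mapping that the Java Blend announcement describes can be illustrated with a small, hand-written sketch. This is not Java Blend's API, which the announcement does not show; it is written in Python with the standard sqlite3 module, and the customers table, the Customer class, and the load_customers helper are all hypothetical names chosen for illustration.

```python
import sqlite3
from dataclasses import dataclass


@dataclass
class Customer:
    """Plain object standing in for one row of a hypothetical customers table."""
    id: int
    name: str
    city: str


def load_customers(conn: sqlite3.Connection) -> list[Customer]:
    """Turn each row of the table into a Customer object."""
    rows = conn.execute("SELECT id, name, city FROM customers ORDER BY id")
    return [Customer(*row) for row in rows]


def demo() -> list[Customer]:
    # An in-memory database keeps the sketch self-contained.
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT, city TEXT)"
    )
    conn.executemany(
        "INSERT INTO customers (id, name, city) VALUES (?, ?, ?)",
        [(1, "Ada", "London"), (2, "Grace", "Arlington")],
    )
    customers = load_customers(conn)
    conn.close()
    return customers


if __name__ == "__main__":
    for customer in demo():
        print(customer)
```

Here the SQL is written by hand inside one helper; a mapping tool of the kind described would presumably generate or hide that layer so that application code works only with the objects.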


Tashcom Software has introduced ASPEdit 
2.05 for Windows, an Active Server Pages, 
scripting, and HTML editor. ASPEdit provides
full support for VBScript. Its code highlighter
colorizes HTML, ASP, and VBScript so you
can see what is code and what is text. ASPEdit also
supports the WebTV tags, ColdFusion, and 
common SQL commands. ASPEdit is de- 
signed to run under Windows 95/98/NT. 
ASPEdit 2.05 costs $50.00 for a single user 
license; site licenses are available. 

Tashcom Software 

46 Highstreet, Hauxton 

CB2 5HW, Cambridgeshire UK 

01223-873182

http://www.aspedit.co.uk/ 


Rebol Technologies has introduced Rebol/ 
Command 1.0, the platform-independent 
Internet communications language, with expanded
functions and features for corporate
enterprise applications. Applications developed
with Rebol/Command 1.0 can be
integrated with platform-specific libraries, 
tools, applications, and databases, as part 
of e-commerce, e-mail, or site management 
messaging services. A complement or al- 
ternative to languages such as Java, C++, 
and Perl, Rebol provides a common data 
format for information exchange with In- 
ternet applications. Rebol/Core, the first 
version, is available as a free download. 
Expanded features of Rebol/Command 1.0 
include support for ODBC standard for 
database connectivity; support for calling 
third-party applications, platform-specific 
systems, or shell commands from within 
Rebol scripts; support for calling third- 
party DLLs and shared objects; and im- 
proved trace and debug capabilities. 

Rebol Technologies 

P.O. Box 1510 

Ukiah, CA 95482 

707-467-8000 

http://www.rebol.com/ 


ILOG has unveiled the ILOG JViews Com- 
ponent Suite for creating e-cockpits —win- 
dows into the supply chain that enable 
web-based collaboration of supply chain 
trading partners. This release combines 
JavaBeans, class libraries, and editor ap- 
plications. The features include an en- 
hanced graph layout package for visual- 
ization of work and process flows. Also 
new is a Gantt chart module for sharing 
resource-oriented and task-oriented
scheduling over the Web. The JViews 
Components Suite also includes an en- 
hanced map display engine, connection 
to different back-end map servers, and 
Composer, a custom graphics editor. 
JViews runs on any Java platform and web 
browser that supports JDK 1.1, including 
Netscape Communicator and Microsoft Explorer.
Pricing starts at $6500.00.

ILOG 

1080 Linda Vista Avenue 

Mountain View, CA 94043 

650-567-8000 

http://www.ilog.com/ 


Zentropix has announced its Remote Ker- 
nel Module Step and Trace stub for GDB 
and the Remote Run-time Data Debugger 
(R2D2) as open-source developments under
the GNU General Public License. These tools
are for real-time Linux development. The 
R2D2 debugger addresses later develop- 
ment stages during which it is critical that 
the application is not stopped or stepped 
during the debug. This debugger provides 
nonintrusive, run-time symbolic access to 
parameters within user space and within 
the kernel and its modules, while the application
continues to execute at its normal
iteration. This paradigm increases the
speed and simplicity of application tuning,
integration, and verification. Both
products are available on Zentropix’s val- 
idated RealTime Linux installation CD, 
Version 2.2. 

Zentropic Computing, LLC (Zentropix) 

441-B Carlisle Drive 

Herndon, VA 20170 

703-471-6690 

http://www.zentropix.com/ 


The Numerical Algorithms Group (NAG) 
and The Portland Group (PGI) have re- 
leased an optimized version of the NAG 
Fortran Library for Intel processor-based 
Linux workstations, servers, and clusters. 
Mark 18 of the NAG Fortran Library for 
Linux is built and validated using PGI’s 
Linux compiler, and is fully interoperable 
with the latest release of PGI’s Fortran, C, 
and C++ compilers, and tools for Linux. 
The NAG Library comprises reusable soft- 
ware components packaged to provide 
application developers with mathematical 
and statistical functionality. The Portland 
Group has also announced PGI Worksta- 
tion 3.1, the latest release of PGI’s suite 
of parallel Fortran, C, and C++ compilers 
and tools. PGI Workstation 3.1 is sup- 
ported on Intel processor-based work- 
stations, servers, and clusters running the 
Linux, Solaris86, and NT operating sys- 
tems. Pricing starts at $299.00 for F77-only 
or C/C++-only packages, and $499.00 for 
full F90/HPF packages.

The Numerical Algorithms Group 

1400 Opus Place, Suite 200 

Downers Grove, IL 60515 

630-971-2337 

http://www.nag.com/ 


The Portland Group Inc. 

9150 SW Pioneer Court, Suite H 
Wilsonville, OR 97070 
503-682-2806 
http://www.pgroup.com/ 


TestComposer, from CS Verilog, is a new 
module from the ObjectGeode product 
line. Test objectives can either be pro- 
vided by the user, with respect to func- 
tional requirements, or computed by Test- 
Composer, with respect to structural 
coverage. Test suites are currently gener- 
ated either in Message Sequence Charts 
(MSC) or in Tree and Tabular Combined 
Notation (TTCN) format. ObjectGeode is 
a toolset dedicated to analysis, design, 
verification, and validation through sim- 
ulation, code generation, and testing of 
real-time and distributed applications. 
ObjectGeode supports a coherent inte- 
gration of complementary object-oriented 
approaches based on standards: UML, 
SDL, and MSC. ObjectGeode provides 
graphical editors, a powerful interactive,
random, and exhaustive simulator, a
C/C++ code generator targeting popular 
real-time OSs such as VxWorks, VRTX, 
pSOS+, Win32, Posix, and Chorus, and 
network protocols such as TCP/IP, and a 
design-level debugger. ObjectGeode is
also integrated with products such as 
QSS’s DOORS, WindRiver’s Tornado, and 
Rational’s Rose and ClearCase products. 

CS Verilog 

155 Villa Avenue, #7 

Los Gatos, CA 95032 

408-395-6367 

http://www.csverilog.com/ 


Frontline Systems has shipped Version 3.5 
of its Small-Scale Solver DLL software, 
aimed at application developers seeking 
to add optimization capabilities to their 
software. Version 3.5 of the Solver DLL 
line introduces features such as reentrant 
operation for multithreaded intranet and 
web server applications, an Evolutionary 
Solver engine, and an increase in capaci- 
ty for linear and quadratic problems from 
800 to 2000 variables. The Solver DLL 
products bring to user-written applications 
the optimization engines that form the 
core of the Microsoft Excel Solver, the 
Quattro Pro Optimizer, and the Lotus 1- 
2-3 Solver. Applications can use the Solver 
DLLs to automatically find the best way 
to allocate scarce resources such as mon- 
ey, raw materials, equipment, or people 
time, to maximize profits or minimize costs 
while operating within certain limits. A 
free evaluation version of the Solver DLL 
3.5 is now available for download from 
Frontline Systems’ web site. 

Frontline Systems Inc. 

P.O. Box 4288 

Incline Village, NV 89450 

775-831-0300 

http://www.frontsys.com/ 
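
The kind of resource-allocation problem mentioned in the Solver DLL announcement can be shown with a toy linear program. The sketch below does not use the Solver DLL or reproduce its engines; it is plain Python that checks every corner point of a two-variable feasible region, an approach that only works for tiny, bounded problems. The constraint and profit figures are hypothetical.

```python
from itertools import combinations

# Each constraint is (a, b, c), meaning a*x + b*y <= c.
CONSTRAINTS = [
    (1.0, 0.0, 4.0),    # hypothetical capacity limit on product x
    (0.0, 2.0, 12.0),   # hypothetical capacity limit on product y
    (3.0, 2.0, 18.0),   # hypothetical shared labor limit
    (-1.0, 0.0, 0.0),   # x >= 0
    (0.0, -1.0, 0.0),   # y >= 0
]
PROFIT = (3.0, 5.0)     # hypothetical profit per unit of x and of y


def feasible(x, y, eps=1e-9):
    """True when the point satisfies every constraint (within a tolerance)."""
    return all(a * x + b * y <= c + eps for a, b, c in CONSTRAINTS)


def best_plan():
    """Return (profit, x, y) for the most profitable feasible corner point."""
    best = None
    for (a1, b1, c1), (a2, b2, c2) in combinations(CONSTRAINTS, 2):
        det = a1 * b2 - a2 * b1
        if abs(det) < 1e-12:
            continue  # parallel boundaries never intersect
        # Cramer's rule for the intersection of the two boundary lines.
        x = (c1 * b2 - c2 * b1) / det
        y = (a1 * c2 - a2 * c1) / det
        if feasible(x, y):
            profit = PROFIT[0] * x + PROFIT[1] * y
            if best is None or profit > best[0]:
                best = (profit, x, y)
    return best


if __name__ == "__main__":
    print(best_plan())
```

A linear objective over a bounded feasible region reaches its maximum at a corner, which is why checking corners suffices here. Problems with hundreds or thousands of variables need a real optimization engine, since the number of corners grows far too quickly to enumerate.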


Symantec has released the VisualCafé 4 
development environment for Java. This 
version of VisualCafé offers an adaptable, 
integrated application environment and 
takes full advantage of the new Java 2 
platform from Sun, including multiserver 
Enterprise JavaBeans support, Java Server- 
Pages, servlets, CORBA, and multitier dis- 
tributed debugging. All three versions of 
the VisualCafé 4 family— Standard, Ex- 
pert, and Enterprise — include the Light- 
ning JIT4 compiler, which provides Java 
2 execution. 

Symantec Corp. 

10201 Torre Avenue 

Cupertino, CA 95014 

888-822-3409 

http://www.symantec.com/

SWAINE’S FLAMES





Scaffolding 


My Cousin Corbett and I were building a wooden scaffold to stand on outside Stately Swaine
Manor II while installing a window on the second floor. “I have a new venture,” Corbett told me
as we clambered up onto the first plank platform, laden with 2x4s to start building the second
level of the scaffold. 

“Huh,” I commented. One reason that I have trouble getting a word in edgewise with Corbett is that 
he can talk while doing things like climbing a scaffold carrying an armload of 2x4s, or eating my food. 

“It’s a comparison shopping service comparison service.” 

“Huh?” I expostulated, laying down my load and taking several deep breaths. He proceeded to 
describe comparison shopping services while I did deep-breathing exercises. “There are different 
models,” he said, “Some can tell you while you’re examining a book at Amazon.com what it'll cost 
you at Borders, some watch your shopping habits and make personalized recommendations. Some 
accept money from sellers, and those you have to suspect of bias in favor of those sellers. So how 
does the poor consumer judge what shopping comparison service is the best?” 

“Your service?” I guessed, hammering 2x4s together like a mad thing. 

“That’s right; we compare the comparison services and rate them.” He had run out of 2x4s and 
reached down to pry one loose from the structure we were standing on. 

“By what criteria?” I asked, as he yanked loose another strut and handed it to me. 

“That’s proprietary.” 

“Ah, but who judges the judges?” 

I thought I had him with that pithy quote from Plato’s Republic, if that’s what it was a pithy quote 
from, but he surprised me. “We do, actually. We have a secondary service in which we rate 
comparison shopping service comparison services.” 

“Like yours.” 

“Like ours.” I watched with concern as he pried several more 2x4s out of level one. 

“So that would be a comparison shopping service comparison service comparison service?” I asked. 

“That’s right,” he said. “We compare all the comparison shopping service comparison services.” 

“Are there any others?” 

“No, we're alone in the field.” 

“The comparison isn’t worth much, then, I’d think.”

“You would be wrong,” he said. “It’s all explained by Peano’s truth convergence theorem. Under 
optimal circumstances, if a set S1 of sources assigns probabilities to a set S0 of assertions, and a set S2
of sources assigns probabilities to the accuracies of the S1 judgments and so forth, these probabilities
of probabilities will converge to a vector of values that is a monotonic function of the truth values of
the S0 assertions. And Bretano’s corollary states that from this convergent vector of values the original
truth values can be derived.” 

I thought about it. “So what you're saying is that as judges make judgments about judges’ 
judgments, they get closer and closer to the truth.” 

“Exactly.” 

“Under ‘optimal circumstances.’”

“Of course.” 

“Sounds like something for nothing to me.” I looked nervously at the nearly empty space beneath 
our platform where he had removed almost all the supporting 2x4s. 

“It’s just gainy communication. You know, there’s lossless and lossy compression. Gainy is the best. 
That’s where more information comes out of a transmission than went in.” 

“Is that possible?” 

“Sure. Like when I talk in my inept French to my French CFO and he tells me what I meant to say. 
He gets more out of the transmission than I put in. Here, grab the other end of the window.” 

We set the window in place. North to south it fit like Joshua, but there was a gap of about an inch- 
and-a-half above the frame. What it needed, I saw, was a 2x4 to plug the hole. 

As Corbett bent down and removed the last 2x4 from the first level of the scaffold, I shook my 
head. “Now how are we going to get down from here?” I asked, pointing out the shortsightedness of 
his policy of robbing level one to pay level two, as it were. There was nothing but six feet of air under 
the scaffolding on which we stood. 

He sighed, opened the window, stepped through, and disappeared into the house. 




Michael Swaine 
editor-at-large 
mswaine@swaine.com

- on Securing Your e-Business Applications.


Connect to the power of RSA BSAFE e-security components. 


It seems like the whole world is out there racing to record e-business 
profits. So where does that leave you? If you find yourself stuck curb- 
side, while competitors leave you behind to sort through e-security 
issues, then RSA Security has the power boost you need — whether 
you develop apps for Web business, general client-server, remote 
management or embedded devices. 


RSA BSAFE components. Mega volts of security. Our RSA BSAFE 
e-security components give you all the tools you need to power up 
your e-business applications. For SSL in C, SSL for Java® technology 
and S/MIME, we get you rolling in a way those under-powered off- 
the-shelf browsers just can’t match. In one jolt, you'll get full-strength 
crypto (available worldwide*), more extensive sample code and 
protocol links for services such as Telnet, FTP, NNTP, SMTP and
POP3. And, by combining SSL and S/MIME security protocols with
the industry's most trusted cryptography, RSA BSAFE security gives 
you the jump you need to get to market faster. 


RSA BSAFE e-security. Power you control. Don’t be stalled by 
e-security issues. With RSA BSAFE security components you’re in
charge of the implementation of our airtight security within your
application. You control the interface, error recovery and the best
balance of strength vs. accessibility for your needs. Best of all, you
start off with the confidence that your e-security operates inde- 
pendently of changes to a browser. Your apps will always be 
completely interoperable. 


RSA Security. Rev up with the best. Connect with RSA Security 
and you work with the industry leader. We've been the driving force 
behind putting SSL and S/MIME on the map as industry standard 
security protocols. No one knows more about how to build secure 
applications, so you can count on us to provide tools and support 
you can use to meet your goals on time and on budget. 


Get a jump start on the competition. Qualify today for a free 
RSA BSAFE SSL-C, SSL-J or S/MIME-C software development
kit and get charged up.


RSA SECURITY
The Most Trusted Name in e-Security 


*Export or other restrictions may apply. RSA, BSAFE, the RSA Security and RSA Secured logos are trademarks or 
registered trademarks of RSA Security Inc. Java is a trademark of Sun Microsystems, Inc. © 1999 RSA Security Inc. 
All rights reserved.